Preface



It has been 8 years since the last volumes on yeast genetics appeared in 
Methods in Enzymology (Guide to Yeast Genetics and Molecular and Cell 
Biology; Volumes 350, 351). At that time, Saccharomyces cerevisiae was 
already acknowledged to be the most advanced system for the exploration 
of basic questions in cell biology. The existence of a small, well-annotated 
genome, together with simple and facile genetics and a wealth of functional 
tools, was propelling new discoveries at an unprecedented rate. Predictably, 
the number of completely uncharacterized genes has since dwindled from 
roughly one-third to just a few handfuls. Harder to foresee then, however, 
was how these new tools would qualitatively change the way one could 
attack biological questions, rapidly ushering in the "postgenomic" era. 
Suddenly the experimental world became virtually finite. Interested in 
how the ER makes a particular lipid? One no longer had to carry out an 
open-ended search hoping to find a few key players and bootstrap one's way 
through the pathway. Instead, one could immediately focus on the system- 
atic exploration of several hundred genes whose products localize to that 
organelle. This targeted exploration could in turn be enormously facilitated 
by the ready availability of comprehensive collections of null or compro- 
mised alleles and tagged genes, exploiting rapid methods for genetic crosses 
and quantitative phenotypic characterization of the candidates. Of compa- 
rable impact is the ability to then transition from a function-centered point 
of view (i.e., the identification of players important for a cellular process of 
interest) to one in which the full sets of processes that a given gene impacts 
can be systematically explored, derived from the functional and physical 
interactions with other gene products. This new paradigm has already 
pointed to many novel and unanticipated cellular functions. 

In the present volume, we have documented many of the major experi- 
mental and analytical advances that have catalyzed this change. Additionally, 
we have highlighted a variety of powerful improvements in biochemical 
and cytological approaches. Finally, the volume concludes with basic 
primers on other yeasts, including Schizosaccharomyces pombe as well as the
clinically relevant Cryptococcus neoformans and Candida albicans. Yeast is the 
best characterized eukaryotic cell and will likely remain so for the foresee- 
able future. The next exciting frontier to consider is how the availability of a 
complete wiring diagram will inform our understanding of how the cell 
functions as an integrated machine. 
Abstract

This chapter provides a guide to analyzing gene function using DNA micro- 
arrays. First, I discuss the design and interpretation of experiments where gene 
expression levels in mutant and wild-type strains are compared. I then provide a 
detailed description of the protocols for isolating mRNA from yeast cells, 
converting the RNA into dye-labeled cDNA, and hybridizing these samples to 
a microarray. Finally, I discuss methods for washing, scanning, and analyzing 
the arrays. Emphasis is placed on describing approaches and techniques that 
help to minimize the artifacts and noise that so often plague microarray data. 
1. Introduction and Experimental Design

DNA microarrays are a powerful tool for studying the function of 
signaling proteins, transcription factors, and the networks that they
comprise. The basic premise of the approach is simple: by analyzing the global
gene expression profile of mutant and wild-type (WT) strains it should be
possible to deduce the function of any protein or genomic element
perturbed. In practice, however, the design, execution, and interpretation of
these experiments require strict attention to detail. This is particularly true if
the data are to be interpreted at a quantitative level.

1.1. Single-mutant analysis 

Perhaps the most straightforward approach to interrogating gene function 
using microarrays is to compare the mRNA levels in a strain with a gene 
deleted to those in the WT parental strain. This approach has now been 
used to analyze the function of hundreds of Saccharomyces cerevisiae genes 
leading to genome-wide maps of signaling pathways, cell cycle control, and 
the DNA damage response (Hughes et al., 2000; Ideker et al., 2001; Roberts
et al., 2000; Workman et al., 2006). This approach has also been applied to
other yeast species such as Candida albicans and Schizosaccharomyces pombe, leading
to interesting insights into the evolution of signaling systems and the
molecular determinants of pathogenicity (Enjalbert et al., 2006; Smith
et al., 2002; Tsong et al., 2006; Tuch et al., 2008). The difficulty with
such experiments, however, is that gene deletion often leads to secondary 
changes that make the microarray results difficult to interpret. For example, 
if removing the gene for one transcription factor also leads to changes in the 
expression of several other transcription factors, it will not be possible to 
draw simple conclusions about the genes regulated by the factor that is 
knocked out. Secondary effects can also be created through downstream 
changes in metabolite concentrations or the activity of signaling molecules, 
and are therefore a potential problem in almost all mutant strains. Compu- 
tational approaches can be used to dissect out the direct and indirect effects 
of such deletions (Workman et al., 2006) but, in general, this deconvolution
is difficult to achieve. 

To avoid confounding secondary effects, microarray experiments should 
therefore be designed around the conditional removal of gene function. In 
many cases, this can be achieved simply by growing the cells in conditions 
where the gene product is inactive and then probing mRNA levels shortly 
after activation by appropriate stimuli. For example, in a recent study of the 
Hog1 network, array analysis showed that deletion of network components
had little or no effect on gene expression in standard growth conditions
(Capaldi et al., 2008). However, when the same strains were examined
shortly after osmotic stress (10-20 min later), dramatic changes were
observed in the expression of hundreds of genes. Correlation with 
genome-wide transcription factor binding data from chromatin immuno- 
precipitation (ChIP; described in detail in Chapter 4 of this volume) showed 
that these effects are direct. By contrast, when transcription factor binding 
data was compared to expression data collected an hour after exposure to 
osmotic stress, the correlations were weak, demonstrating a buildup of 
secondary effects. 

In those instances where proteins are constitutively active, alternative 
approaches can be taken to avoid or minimize secondary effects. For 
example, several studies have taken advantage of analog-sensitive kinase 
alleles to conditionally block kinase activity (Carroll et al., 2001; Papa et al.,
2003; Zaman et al., 2009). Other approaches to conditional perturbation
include utilization of inducible promoters, temperature-sensitive alleles, or 
drug-regulated protein degradation. In cases where conditional perturba- 
tions are not possible, secondary effects need to be kept in mind and analysis 
limited to global effects and correlations. 

A highly related approach to examining gene function using DNA 
microarrays is to measure gene expression in strains with mutations that 
eliminate sites of posttranslational modification or modify DNA binding 
sites. Such studies have been used to build detailed models of regulatory 
mechanisms (Leber et al., 2004; Springer et al., 2003; Wang et al., 2004), but
the same caveats apply. If the mutation leads to constitutive changes or the 
expression changes are examined long after activating conditions are 
applied, secondary effects can complicate the interpretation. 

In either gene deletion or gene mutation experiments, the resulting data 
can be examined at several levels. First, it is usually possible to gain insight 
into the function of the element perturbed by examining the biological role 
of the genes up- or downregulated in the mutant; for example, by looking 
for gene ontology (GO) terms enriched in the regulated gene-set (Chu 
et al., 1998; DeRisi et al., 1997). Second, it is often possible to identify the
pathway(s) and/or proteins that are affected by your mutation by looking 
for correlations with previous datasets. As proteins in the same or interacting 
pathways regulate highly overlapping gene-sets, comparison with previously
acquired KO data (e.g., the Hughes compendium; Hughes et al.,
2000) can be highly informative (Marion et al., 2004; Segal et al., 2003). It
is also possible to identify transcription factors regulated by, or that interact
with, the gene under study by looking for significant overlap with target
genes identified using genome-wide ChIP analysis (Bar-Joseph et al., 2003).
Finally, the transcription factors regulated directly or indirectly by the 
genetic element under investigation can be identified using motif analysis 
(Beer and Tavazoie, 2004; Roth et al., 1998; Wang et al., 2005). Further
details of the computational methods used to perform such analyses are 
provided in Chapter 2 of this volume. 
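
As an illustration of the first level of analysis, the enrichment of a GO term in a regulated gene-set can be scored with a one-sided hypergeometric test. The sketch below is a minimal example under stated assumptions: the hypergeometric test is a common choice but is not necessarily the statistic used in the studies cited above, and the function name and gene counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_genome, n_term, n_regulated, n_overlap):
    """Probability of seeing at least n_overlap annotated genes by chance.

    n_genome    -- total number of genes on the array
    n_term      -- genes in the genome annotated with the GO term
    n_regulated -- genes up- or downregulated in the mutant
    n_overlap   -- regulated genes that carry the GO term
    """
    total = comb(n_genome, n_regulated)
    upper = min(n_term, n_regulated)
    tail = sum(
        comb(n_term, k) * comb(n_genome - n_term, n_regulated - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 6000 genes, 100 annotated with the term,
# 200 regulated in the mutant, 15 of which carry the term.
p_value = hypergeom_enrichment_p(6000, 100, 200, 15)
```

When many GO terms are tested against the same gene-set, the resulting p-values would normally need a multiple-testing correction before any term is called enriched.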
1.2. Double-mutant analysis

While much can be gained by analyzing a strain with a single mutation/ 
deletion under a single condition, the full power of microarray analysis is 
only realized when multiple related experiments are compared. By exam- 
ining expression in a range of conditions it is possible to determine when a 
protein or pathway of interest is activated and how its function depends on 
signal level and/or type. Furthermore, by examining expression in a variety
of strains, a quantitative genome-wide interaction map can be constructed. 
This approach is particularly powerful when double mutants are examined, 
as described below. 

A direct or indirect interaction between two factors, A and B, can be 
inferred when microarray data reveal a significant overlap in the gene-sets 
that A and B regulate. However, such data cannot be used to determine if 
the factors act independently, cooperatively, or partially cooperatively to 
regulate these genes (Fig. 1.1A). To complicate matters further, the interaction
type may vary from gene to gene. Therefore, to distinguish between
these mechanisms, gene expression data in the double-mutant strain
must also be examined. If the factors A and B act cooperatively to regulate
a gene, the expression defect in the single-mutant (AΔ and BΔ) and
double-mutant (AΔBΔ) strains will be identical (Fig. 1.1B, middle panel). By
contrast, if the factors act independently, the defect in the double mutant 
will be the sum of the defects in the single mutants (Fig. 1.1B, top panel). In 
cases where the interaction is partially cooperative, the expression defect 
in the double mutant will be somewhere in-between the values expected 
for a fully independent or fully cooperative interaction (Fig. 1.1B, bottom 
panel). 

The strength of double-mutant analysis is the ability to determine how 
the interaction of any two proteins (or mutations) affects each and every 
gene in the genome. This analysis not only makes it possible to build 
detailed network or circuit diagrams but also provides clues as to the precise 
mechanism of the identified interaction. For example, where all or most of 
the genes regulated by factors A and B are influenced by a cooperative 
interaction, it is highly likely that the interaction occurs at the signaling 
level. By contrast, where only a subset of genes depends on a cooperative 
interaction, it is more likely that the interaction occurs at the level of 
transcription factor activation or through another downstream effector.
Importantly, however, the ability to build such detailed models is limited 
by the noise in the microarray analysis; noise that is compounded when 
single and double mutants are compared. Therefore, multiple measurements 
need to be made and statistics applied to distinguish between the possible 
regulatory mechanisms (Fig. 1.1) at each gene. To allow such error analysis, 
it is critical to first break down the data describing the interaction at each 
gene into its fundamental components. Following the example in Fig. 1.1,
these components are the induction/repression from A alone, the
induction/repression from B alone, and the influence that the interaction
between A and B has on expression (or the cooperative component;
Fig. 1.1B). The values of these components can be determined simply by
comparing the expression defects in microarrays examining the single and
double mutants. The array comparing expression in AΔ to the WT reports
the value of A + Co for each gene, the array comparing expression in BΔ
to the WT reports the value of B + Co for each gene, while the data for the
double mutant AΔBΔ compared to the WT reports the value of A + B +
Co. In this case, the error for each expression component for each gene can
be estimated by propagating the errors through the calculation and using a
simple t-test (using log expression values). Once this is done it is possible to
cluster data based on the type of interaction (pattern of significant A, B, and
Co components) and identify groups of genes regulated by a common
mechanism. The double-mutant approach, its application, and the analysis
of the associated errors are described in more detail in Capaldi et al. (2008).
Statistical analysis using these or related methods can be carried out using the
free software package R (http://www.r-project.org) or MATLAB
(http://www.mathworks.com/).

Figure 1.1 Single- and double-mutant analysis of gene expression. (A) Venn diagram
summarizing the overlap in genes with a significant defect in gene expression due to
deletion of gene A (AΔ) or gene B (BΔ). The wiring diagrams indicate the possible ways
factors A and B can interact to regulate expression of overlapping sets of genes. (B)
Schematic illustrating the application of the double-mutant approach to analyzing
transcriptional network structure and function. The bar graphs on the left show the
defects expected in AΔ, BΔ, and AΔBΔ strains for each of the three sample mechanisms.
The bar graphs on the right show the values of the three expression components (A, B,
and Co) determined by fitting the expression data for the AΔ, BΔ, and AΔBΔ strains
(see the text for details).
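
The three array measurements described in the text (A + Co, B + Co, and A + B + Co) can be solved for the individual components by simple subtraction. The sketch below shows this for one gene. It is an illustration only, not the implementation used in Capaldi et al. (2008): the function names and replicate values are hypothetical, errors are assumed to be independent between arrays, and the final statistic is a plain estimate-to-error ratio that would still need to be compared with a t distribution (for example in R or MATLAB).

```python
from math import sqrt
from statistics import mean, stdev


def sem(values):
    """Standard error of the mean of replicate log2 expression defects."""
    return stdev(values) / sqrt(len(values))


def expression_components(d_a, d_b, d_ab):
    """Split one gene's expression defects into A, B, and Co.

    d_a  -- replicate log2 defects in the A deletion vs WT   (reports A + Co)
    d_b  -- replicate log2 defects in the B deletion vs WT   (reports B + Co)
    d_ab -- replicate log2 defects in the double mutant vs WT (A + B + Co)
    Returns {component: (estimate, standard_error, estimate / standard_error)}.
    Assumes the replicates within each list are not all identical.
    """
    m_a, m_b, m_ab = mean(d_a), mean(d_b), mean(d_ab)
    s_a, s_b, s_ab = sem(d_a), sem(d_b), sem(d_ab)
    estimates = {
        # (A + B + Co) - (B + Co) = A
        "A": (m_ab - m_b, sqrt(s_ab ** 2 + s_b ** 2)),
        # (A + B + Co) - (A + Co) = B
        "B": (m_ab - m_a, sqrt(s_ab ** 2 + s_a ** 2)),
        # (A + Co) + (B + Co) - (A + B + Co) = Co
        "Co": (m_a + m_b - m_ab, sqrt(s_a ** 2 + s_b ** 2 + s_ab ** 2)),
    }
    return {name: (est, se, est / se) for name, (est, se) in estimates.items()}


# Hypothetical gene with identical defects in all three strains,
# i.e., the fully cooperative case.
components = expression_components(
    d_a=[-2.1, -1.9, -2.0],
    d_b=[-2.0, -2.2, -1.8],
    d_ab=[-2.0, -2.1, -1.9],
)
```

For this example the A and B estimates are close to zero and the whole defect is assigned to Co. Note that Co combines the errors from all three arrays, so it is the component estimated with the least precision.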




2. Methods 

Once an experiment or series of experiments examining gene function 
is outlined, it must be translated into a detailed procedure designed to 
measure mRNA levels while limiting noise (biological or otherwise) in 
the data. In all cases this means using two-color DNA microarrays.
The outline of the procedure (developed by DeRisi et al., 1997) is as follows:
Cells are grown under the appropriate conditions and mRNA is extracted 
and purified. The mRNA is then converted into cDNA using reverse 
transcription and labeled with one of two fluorophores (Cy3 or Cy5). 
Two cDNA samples, an experimental sample labeled with Cy5 and a control 
labeled with Cy3, are then hybridized to an array consisting of thousands of 
different DNA fragments spotted onto a glass slide, where each fragment is 
complementary to a single gene. These arrays are then washed to remove any
cDNA that binds nonspecifically and analyzed using a laser scanner to 
measure the Cy5 and Cy3 fluorescence. Finally, after data normalization, 
the ratio of Cy5 to Cy3 fluorescence at each spot is calculated and used to 
determine the difference in the mRNA expression levels in the two samples. 
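
The final step of this outline can be sketched as follows. The normalization method is not specified here, so the example assumes the simplest common choice, centering the log2 ratios on their median, which in turn assumes that most genes do not change between the two samples and that background has already been subtracted. The function name and intensities are hypothetical.

```python
from math import log2
from statistics import median


def normalized_log_ratios(cy5, cy3):
    """Return median-centered log2(Cy5/Cy3) values, one per spot."""
    raw = [log2(red / green) for red, green in zip(cy5, cy3)]
    shift = median(raw)  # global dye or loading bias shared by all spots
    return [value - shift for value in raw]


# Hypothetical background-subtracted intensities for five spots.
cy5_signal = [1200.0, 800.0, 400.0, 3200.0, 1000.0]
cy3_signal = [600.0, 400.0, 200.0, 400.0, 1000.0]
log_ratios = normalized_log_ratios(cy5_signal, cy3_signal)
```

In this example the Cy5 channel is twofold brighter at most spots, so that shared factor is removed and only the fourth and fifth spots are left with nonzero log ratios.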



2.1. Experimental design 

As each microarray compares the expression levels in the two samples, the 
first step in designing a microarray procedure is deciding which samples will 
be compared on a given array. The idea is to accurately measure the 
parameters of interest using the smallest total number of arrays. For large 
experiments, identifying the best experimental design can be complicated 
(Kerr and Churchill, 2001), but for more limited experiments some simple 
principles apply. In most cases where a mutant is being analyzed, it is best to 
directly compare the WT and mutant strains, grown under the conditions of 
interest, on the same array. This way the influence that the mutant has on
gene expression is determined with the errors from a single array. An
alternative approach is to measure gene induction after stimulus in the
WT strain on one array (e.g., WT + stress versus WT in no stress) and
the gene induction after stimulus in the mutant strain (e.g., AΔ + stress
versus AΔ in no stress) on a separate array. In this commonly used scheme
the influence that a mutant has on gene expression can only be calculated by 
dividing the values from the two arrays and thus the errors from each 
individual array are multiplied. However, in the case where mutant effects 
are measured on a single array, it is still important to examine expression in 
the WT strain (e.g., WT + stress versus WT in no stress) to ensure that the 
genes influenced by mutation are also regulated in the WT background. 
Where double mutants are examined it is best to compare the expression 
levels directly to the single mutant(s) on the same array. This reduces the 
magnitude of the change in gene expression seen on the array and thus the 
overall noise as its major component is proportional to signal. 
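
The cost of the two-array scheme can be made concrete with a small calculation. On a log scale, dividing the values from two arrays becomes a subtraction, and independent errors then combine in quadrature. The sketch below assumes independent errors of equal size on each array; the per-array standard deviation is a hypothetical number chosen for illustration.

```python
from math import sqrt


def combined_sd(*array_sds):
    """SD of a sum or difference of independent log2 ratios."""
    return sqrt(sum(sd ** 2 for sd in array_sds))


per_array_sd = 0.2  # hypothetical SD of a log2 ratio measured on one array

# Mutant and WT compared directly on a single array.
direct_sd = combined_sd(per_array_sd)

# Mutant effect calculated from two separate stress-versus-no-stress arrays.
indirect_sd = combined_sd(per_array_sd, per_array_sd)
```

Under these assumptions the two-array design carries about 1.4 times the error of the direct comparison, so roughly twice as many replicates are needed to reach the same precision.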

2.2. Cell growth 

Once the experiments are designed, cells need to be grown and harvested 
carefully to limit unwanted sample-to-sample variation. First, strains that are 
going to be compared on a single array should be grown at the same time 
and in identical medium as variation in nutrient levels, temperature, and 
other parameters can introduce substantial noise. Second, the strains should 
grow for at least two doublings to wash out any differences in the overnight 
cultures. Third, cells should be harvested at the same optical density (OD) 
and this OD should be selected so that the cells are approximately one 
doubling away from any transition induced by nutrient depletion. For 
example, if cells are studied in log growth phase they should be harvested 
at a density substantially below that of the diauxic shift to ensure that small
variations in cell number do not affect the gene expression pattern (for
S. cerevisiae, OD600 = 0.6 works well). Finally, it is best to harvest cells by
filtration using a 0.2 µm, 90 mm filter (Millipore). This filter is then rolled,
placed in a 50 ml conical tube and submerged in liquid nitrogen. This
ensures rapid harvesting (< 1 min) and little sample-to-sample variation.
For the protocol described below it is best to harvest between 75 and
150 OD600/ml units. Once harvested the cells can be stored at -80 °C
for several weeks before the RNA is extracted. 



2.3. Total RNA isolation and purification 

To extract the mRNA from the frozen cells, first add 12 ml of AE buffer 
(50 mM sodium acetate (pH 5.2) and 10 mM EDTA) to the 50 ml conical
tube and then rotate/shake the tube to remove the cells from the filter.
When this is done for each of the tubes in the set (no more than eight at a
time, but always prepare samples to be compared on the same array at the
same time), transfer the cells and buffer to a 50 ml centrifuge tube and add
800 µl of 25% (w/v) SDS to each of the samples. Finally, add 12 ml of 65 °C
acid phenol (Fluka #77608, note that this is not standard buffer-saturated
phenol) to each tube and incubate for 10 min in a 65 °C water bath,
vortexing every 30 s or so. When this procedure is finished, cool the samples
on ice for 5 min and then centrifuge for 20 min at 12,000 × g and 4 °C.
Carefully decant the phenol-buffer mix from these tubes into prespun
(5 min at 1500 rpm) phase lock tubes (5-Prime) and then add 13 ml of
chloroform to each tube, mix vigorously, and spin at 3000 rpm for 10 min.
After centrifugation the RNA will be in the 10-12 ml aqueous layer at the
top of the tube, separated from the organic phase by the gel in the phase lock
tubes. Pour this solution into a clean centrifuge tube and add 1 ml of 3 M
sodium acetate (pH 5.2) and 10 ml of isopropanol; mix by inverting the tube
several times and spin for 45 min at 17,000 × g and 10 °C. At this point the
total RNA will be present in an approximately 1 cm white pellet. Carefully
decant the isopropanol and add 10 ml of 70% ethanol to each tube, without
disturbing the pellet, and spin again at 17,000 × g and 10 °C but this time for
20 min. Decant as much of the ethanol as possible and then spin the tube at
17,000 × g again for 1 min to collect any remaining liquid at the bottom of
the tube and remove it carefully with a pipette. Finally, let these pellets dry
on the bench until they are translucent (30-60 min).

Once the total RNA pellets are dry, resuspend them in 800 µl of
RNase-free water and measure the absorbance at 260 and 280 nm.
The sample should have >2 mg of RNA and an A260/A280 ratio >2.0. It is
also useful, especially in the first few RNA preparations or if problems are 
encountered, to check the integrity of the RNA sample using an Agilent 
Bioanalyzer or a similar device (agarose gels are only useful for detecting 
severe degradation). Here a good quality sample should have distinct 
rRNA and tRNA bands with well-defined edges. If the sample fails in 
any of the quality controls it should be discarded. 
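
The yield and purity check can be written as a short calculation. The sketch below uses the standard conversion of 40 µg/ml of RNA per A260 unit (1 cm path length) and the two thresholds given above; the function name, the 100-fold dilution, and the absorbance readings are hypothetical.

```python
RNA_UG_PER_ML_PER_A260 = 40.0  # standard conversion for RNA, 1 cm path length


def rna_qc(a260, a280, dilution_factor, volume_ml,
           min_yield_mg=2.0, min_ratio=2.0):
    """Return yield (mg), A260/A280 ratio, and whether both thresholds are met."""
    conc_ug_per_ml = a260 * RNA_UG_PER_ML_PER_A260 * dilution_factor
    yield_mg = conc_ug_per_ml * volume_ml / 1000.0
    ratio = a260 / a280
    return {
        "yield_mg": yield_mg,
        "ratio": ratio,
        "passes": yield_mg > min_yield_mg and ratio > min_ratio,
    }


# Hypothetical reading: sample diluted 100-fold, resuspended in 800 µl (0.8 ml).
qc = rna_qc(a260=0.75, a280=0.36, dilution_factor=100, volume_ml=0.8)
```

With these readings the sample contains 2.4 mg of RNA at a ratio of about 2.08, so it passes both checks. The same readings with an A280 of 0.45 would fail on purity.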

2.4. Purification of poly-A RNA 

To isolate mRNA from the total RNA sample, cellulose resin with a
poly-deoxythymidine oligomer attached (oligo-dT cellulose) is used to purify
transcripts with a poly-A tail. This purification should be done on the same
day as the total RNA purification to limit degradation. First, wash 60 mg of
cellulose resin three times with 750 µl of NETS buffer (0.6 M NaCl, 10 mM
EDTA, 10 mM Tris-HCl (pH 8.0), 2% (w/v) SDS) in a 2 ml screw cap
tube. Here the resin should be spun at 3000 rpm for 1 min on a benchtop
centrifuge between washes and the buffer removed by aspiration. At this
stage, incubate 750 µl of total RNA (2-4 mg) at 65 °C for 10 min, and then
add to a 2 ml tube along with 750 µl of 2× NETS buffer also at 65 °C. Each
tube should then be left to mix on a rotator for 1 h at room temperature. 
After incubation, apply this sample to a disposable column (BioRad #732- 
6008) that has been washed once with NETS buffer. Once the resin has 
settled in the column, wash it three times with 750 µl NETS buffer and then
elute the mRNA with 650 µl of 65 °C ETS buffer (NETS buffer without
the NaCl) by injecting it directly into the column bed. Finally, add 65 µl of
3 M sodium acetate and 650 µl of isopropanol to these tubes, mix well by
inversion, and incubate at -20 °C overnight. The next morning spin the
sample at full speed in a benchtop centrifuge for 1 h at 4 °C. When the spin 
is complete, remove the isopropanol buffer mix from the tube and add 
250 µl of 70% ethanol to the sample, taking care not to disturb the pellet,
and spin at full speed for 20 min at room temperature. Carefully remove the 
ethanol from these samples and allow them to air dry completely (residual 
ethanol will inhibit the reverse transcription in the next step). When dry, 
the pellets will become white and powdery. Resuspend these pellets in 20 µl
of RNase-free water and then spin for 1 min at full speed to remove 
the cellulose fragments that were not trapped by the column. Remove the 
supernatant and measure the absorbance and A260/A280 ratio. This is
best done on a NanoDrop spectrophotometer (Thermo Scientific) due to
the small volume. The yield should be around 20 µg and the
A260/A280 ratio again greater than 2.0.

2.5. Reverse transcription and dye labeling 

On the same day as the mRNA is purified it should be converted into 
cDNA by reverse transcription. Combine 4 µg of poly-A RNA
with 5 µg of an oligo-dT primer (T20) and 5 µg of a random primer (N9) in a
total volume of 15.5 µl and incubate at 70 °C for 8 min before cooling on
ice. Once cool, perform cDNA synthesis using AffinityScript reverse
transcriptase (Stratagene) by adding the RNA and primer mix to 2 µl of
enzyme, 3 µl of AffinityScript buffer, 3 µl of 100 mM DTT, 5.9 µl of
water, and 0.6 µl of 50× aa-dNTP mix and incubate at 42 °C for 2 h. Here
the aa-dNTP mix is made up of 1 mg of amino allyl-dUTP, 20 µl of water,
30 µl of 100 mM dTTP, and 50 µl of 100 mM dATP, dCTP, and dGTP (store this
nucleotide mix at -20 °C in single use aliquots). This reaction will result in
a cDNA library where approximately 1 in 10 bases has a free amine group
that can be labeled by Cy5 or Cy3. To degrade the RNA template, add 4 µl
of 1 M NaOH and 8 µl of 50 mM EDTA and incubate at 65 °C for 10 min.
Finally, neutralize the solution using 40 µl of 1 M HEPES, pH 7.0.
The cDNA can then be purified using a Clean and Concentrator-5 Kit
(Zymo Research) following the manufacturer's instructions except that the
cDNA-HEPES buffer mix should be mixed with 1 ml of binding buffer
before applying it to the column. Elute the DNA from the column in 12 µl
of water and then determine the yield and A260/A280 ratio using the
NanoDrop spectrophotometer; they should be >2 µg and 1.8, respectively.
The cDNA samples can be stored at -20 °C for many weeks before
labeling and hybridization. 

Once the cDNA is synthesized it needs to be labeled with either Cy5 
(typically the sample) or Cy3 (typically the control). To do this, add 1 µl of
1 M sodium bicarbonate buffer (pH 9.0) to 10 µl of the cDNA solution and
add 1 µl of N-hydroxysuccinimidyl ester Cy5 or Cy3 (GE Biosciences) that
has been resuspended in DMSO and incubate at room temperature for 4 h.
Each Cy dye pack has enough dye to label 4-8 samples. After labeling,
purify the samples using the Clean and Concentrator-5 kit, following
the manufacturer's instructions, and again measure the concentration and
A260/A280 ratio using the NanoDrop spectrophotometer. The purified,
labeled cDNA should have a visible color. If necessary,
these samples can be snap frozen in liquid nitrogen and stored at -80 °C.

2.6. Hybridization 

The labeled cDNA samples are now ready to hybridize to a DNA micro- 
array. Many types of microarrays are available for gene expression analysis 
and each has its advantages and disadvantages. The most common of these 
are the printed arrays where PCR products or DNA oligomers are spotted 
onto a polylysine-coated slide and commercial arrays from companies such 
as Agilent and Roche Nimblegen where oligomers are synthesized directly 
on the slide. The advantage of printed arrays is that, when thousands of
arrays are used, the cost can be as low as $20 per array. However, printed 
arrays have several distinct disadvantages. First, there tends to be significant 
variation in the quality and size of the DNA spots. This makes it difficult to
accurately determine the expression ratio for some genes and contributes
substantially to replicate variation. Second, these arrays have to be post- 
processed to neutralize the otherwise highly charged lysine surface. This 
postprocessing leads to imperfections on the surface of the slide and sub- 
stantial background variation. Finally, the polylysine coated surface is deli- 
cate and is often damaged in the printing and hybridization procedure 
leading to further background noise. By contrast, the commercial arrays 
are printed with high accuracy and on more stable substrates, resulting in 
very low background noise. Moreover, the stability of the slide surface
means that the sample applied to these arrays can be vigorously mixed 
during hybridization. This dramatically increases the signal-to-noise ratio 
and means that far less cDNA needs to be applied to the array during 
hybridization. The cost of such arrays is presently greater than $100
apiece, but this continues to fall. Here, I will describe hybridization to
Agilent arrays (see the manufacturer's website for further details), but the
same samples can be hybridized to arrays from other companies following
the manufacturer's instructions, or to printed arrays using protocols from
the DeRisi lab (DeRisi et al., 1997).

The eight array format from Agilent (eight, 15,000 spot arrays per slide) 
works well when 100 ng of Cy5- and 100 ng of Cy3-labeled cDNA is 
hybridized to each array. To do this, bring 125 ng of Cy5- and 125 ng of 
Cy3-labeled sample to 25 µl total volume. Heat this sample to 98 °C for
2 min. Cool the sample by centrifugation for 1 min and then add 25 µl of
2× Hi-RPM hybridization buffer (Agilent) to the sample. Carefully apply
40 µl of the sample to the gasket slide positioned in the base of the SureHyb
chamber, repeat seven more times with different samples to fill each posi- 
tion, and place the microarray face down onto this gasket slide. Add the top 
to the hybridization chamber, tighten the screw, and then rotate the 
chamber slowly to ensure that the large bubble in the chamber moves freely 
(this bubble is critical for mixing during the hybridization) and that no 
smaller bubbles are stuck to the array surface. If any bubbles remain fixed to 
the array, gently tap the chamber on a hard surface until they are dislodged 
and then place in the rotating oven (Agilent) at 65 °C for 17 h. 
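
The amounts quoted above are consistent with one another, as a short calculation shows: 125 ng of each labeled sample is prepared, but only 40 µl of the 50 µl mix reaches the array. The function name below is hypothetical and the sketch simply restates that arithmetic.

```python
def cdna_applied_ng(cdna_in_mix_ng, sample_volume_ul, buffer_volume_ul,
                    applied_volume_ul):
    """Mass of labeled cDNA (ng) that ends up on the array."""
    total_volume_ul = sample_volume_ul + buffer_volume_ul
    return cdna_in_mix_ng * applied_volume_ul / total_volume_ul


# 125 ng in 25 µl of sample plus 25 µl of 2x buffer; 40 µl applied per array.
applied = cdna_applied_ng(125.0, 25.0, 25.0, 40.0)
```

The result is 100 ng of each dye-labeled sample per array, matching the amount recommended for the eight-array format. The extra 10 µl provides a margin for pipetting losses.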

2.7. Microarray washing 

Once the sample is hybridized to the array, excess sample and cDNA that is 
bound nonspecifically must be washed off and the array dried. Listed here is 
a protocol that works well for Agilent arrays; those working with other 
types of arrays should select the appropriate alternative protocol. 

Fill two slide staining dishes with wash buffer I (6× SSPE, 0.005%
N-Lauryl sarcosine, where 1× SSPE is 150 mM NaCl, 10 mM sodium
phosphate, and 1 mM EDTA, pH 7.4), a third chamber with wash buffer II
(0.06× SSPE, 0.005% N-Lauryl sarcosine), and a fourth chamber with
ozone protection and drying solution (Agilent). Place a slide rack in the
second chamber and set chambers 2-4 up on stir plates at a medium setting,
ensuring that the stir bars used are small enough to remain below the bottom 
of the slide rack. Submerge the first array in chamber 1 and pry it open using 
plastic tweezers so that the gasket slide falls away. Gently move the array 
from side to side while submerged in the chamber to remove any bubbles 
from the surface and then place it in the rack in chamber 2 (only handle the 
label on the array). Repeat this process until all the slides are in chamber 
2 and then leave the arrays in this low-stringency wash for 1 min. At this 
stage transfer the entire slide rack to chamber 3, ensuring that the slides 
spend minimal time exposed to the air and that little of the wash buffer I is 
transferred into this new chamber. The slides should then be allowed to sit 
in this high-stringency wash for exactly 1 min before transferring the rack to 
the drying solution for 30 s. At this time slowly lift the rack out of the 
chamber, ensuring that droplets do not form on the surface of the array. 
Now the array is dry and ready to scan. 
2.8. Array scanning

To quantify the amount of Cy5- and Cy3-labeled DNA hybridized to each 
probe (spot on the array), the washed array is analyzed with a laser scanner 
that excites both Cy5 (at 635 nm) and Cy3 (at 532 nm) and then measures 
the emitted light at the appropriate wavelength (670 and 570 nm, respectively) 
using a photomultiplier. It is best to perform the scan immediately 
after washing the arrays, but arrays can be stored in a dry ozone-free area for 
a day or more before scanning if necessary. The most common scanners are 
the Axon 4000B from Molecular Devices and the DNA Microarray Scanner 
from Agilent. 

To get the best quality data from the arrays, the scanner needs to be set 
up appropriately. First, the lasers should be focused onto the surface of the 
slide that is spotted with the probes. This focal plane is easily identified as the 
position where a scan gives the highest overall signal. Next, keeping the 
laser power at 100%, the voltage applied to the photomultiplier tube should 
be set to ensure that the highest Cy5 and Cy3 signals measured are just 
below the maximum of the digitization range (65,535 for a 16-bit A-to-D 
converter). This ensures the highest possible signal-to-noise ratio without 
losing data at some probes due to saturation. Often it is necessary to scan 
arrays several times to find such settings. This does not present a problem as 
the Cy dyes are photostable. 
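
The saturation criterion described above can be checked numerically after each trial scan. The following is a minimal Python sketch, assuming the pixel intensities of one channel have already been read into a list of integers. The function name `scan_summary` and the 90% headroom target are illustrative assumptions and are not taken from any scanner software.

```python
MAX_16BIT = 2**16 - 1  # 65,535, the largest value a 16-bit converter can report


def scan_summary(pixels, max_value=MAX_16BIT, headroom=0.9):
    """Summarize one channel of a trial scan.

    pixels   -- iterable of integer pixel intensities (hypothetical input)
    headroom -- fraction of full scale the brightest pixel should reach
                (0.9 is an arbitrary illustrative target)
    """
    pixels = list(pixels)
    brightest = max(pixels)
    saturated = sum(1 for p in pixels if p >= max_value)
    if saturated:
        advice = "lower PMT voltage"   # data are being lost to saturation
    elif brightest < headroom * max_value:
        advice = "raise PMT voltage"   # dynamic range is being wasted
    else:
        advice = "keep setting"
    return {
        "brightest": brightest,
        "saturated_fraction": saturated / len(pixels),
        "advice": advice,
    }
```

Running this on each channel after every trial scan gives a simple stopping rule for the repeated scans described above.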

The resulting image should show clear spots, each with little pixel-to- 
pixel variation in the Cy5 and Cy3 ratio, surrounded by a uniform background 
with low signal (40-50 units on a 65,000-unit scale; Fig. 1.2A). Poor 
quality array images generally indicate a problem with sample labeling or 
with the wash and hybridization steps. If the signal/noise is uniformly low in 
a single color, the problem is likely to be poor labeling. This is often due to 
degraded dye. If the signal/noise is poor in both colors (Fig. 1.2B and C), 
the problem likely lies in the wash (low signal indicates overly stringent washing; 
high background indicates poor washing). However, poor labeling in both channels 
can lead to similar problems. Large regions without signal can be caused by 
bubbles on the surface of the array or by leakage during hybridization 
(Fig. 1.2D). Finally, speckles on the array are often caused by dust or 
precipitate in the hybridization or wash buffers. Array images should be 
saved as a high-resolution TIFF image file for further analysis. 

Figure 1.2 High-resolution DNA microarray images showing common hybridization 
artifacts. (A) A high-quality array as defined by a high signal-to-noise ratio, a low 
(background level) signal at the negative control spots, and an absence of washing or 
hybridization artifacts. (B) A poorly washed array showing the characteristic high 
background signal. (C) An array with variable background signal caused by cDNA 
precipitation and poor washing. Such precipitation is often caused by loading too much 
cDNA onto the chip. (D) An array with nonuniform hybridization due to leakage from 
the hybridization chamber. 



2.9. Gridding and normalization 

Once a microarray image has been collected, the precise position and 
identity of each probe (or "spot") must be identified. This is done through 
a process known as gridding. First a grid file must be assembled; this 
establishes the identity and expected location of each spot on the array. 
The grid is then overlaid onto the array and any variation in spot location or 
size is adjusted for through a probe-by-probe alignment. The gridding software 
that comes with most scanners does this automatically but in many 
cases some manual adjustments need to be made. At the end of the process, 
spots that overlap with artifacts on the array, or are highly irregular, should 
be flagged to ensure that they do not affect the downstream analysis. At this 
stage, the Cy5 and Cy3 signal intensity at each probe is determined within 
each spot using the same software. While these data represent the gene 
expression changes measured on the array, normalization is required before 
detailed analysis can be performed. First, any systematic difference between 
the Cy5 and Cy3 signals, due to differences in the quantum yield and the 
amount of cDNA loaded on the array, must be corrected. For printed arrays 
this can be accomplished by multiplying the Cy5 and Cy3 signals by a 
constant so that their average, across all spots, is the same. Care must 
be taken, however, to ensure that this is an appropriate normalization. 
For example, when a strain with the gene for a global repressor deleted is 
compared to the WT strain, the average Cy5 and Cy3 signals are expected 
to differ systematically. In this case a set of spike-in control RNAs, or an 
appropriate subset of genes, can be used to normalize the Cy5 and Cy3 
signals. For commercial arrays, where the DNA in each spot is at high 
density and precisely aligned, the Cy3-to-Cy5 ratio tends to be nonlinear 
as a function of signal intensity, due to dye quenching and other effects, 
and thus the more sophisticated locally weighted scatterplot smoothing 
(LOWESS) normalization procedure should be used. Finally, the data for 
the spots with weak intensity need to be thrown out or at least weighted 
appropriately. One simple way to do this is to eliminate all data where both 
the Cy3 and Cy5 signals are less than 1.5-fold above the background. 
For printed arrays the background is determined by the signal around 
each spot, while in commercial arrays it is determined by the signal at 
negative control spots printed at various positions on the array. The former 
method is critical for subtracting away a variable background signal, while 
the latter method is better where the surface chemistry inside and outside 
the spot differs and background variation is negligible. As an alternative 
to throwing out data with low signal, pixel-to-pixel variation in the 
background and spot intensity can be used to calculate an error range for 
each spot and these error values propagated through the analysis. Such data 
filtering and normalization can be carried out in a wide range of databases 
(e.g., the Stanford Microarray Database or Rosetta Resolver). Such data- 
bases are also extremely useful for storing microarray results and images and 
building tables of data from multiple arrays. These tables can then be fed 
into one or more of the wide range of microarray analysis packages avail- 
able, or programs such as R and MATLAB (The MathWorks), for detailed 
analysis as described further in Chapter 2 of this volume. 
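
As an illustration of the two simplest steps described above, global scaling and low-signal filtering, the following Python sketch equalizes the channel means and discards spots where both channels fall below 1.5-fold of background. The dictionary keys and the function name are hypothetical, and the choice to rescale Cy3 rather than Cy5 is arbitrary. Intensity-dependent (LOWESS) normalization is not attempted here.

```python
def normalize_and_filter(spots, min_fold=1.5):
    """Globally scale Cy3 to Cy5 and drop weak spots.

    spots -- list of dicts with hypothetical keys 'cy5', 'cy3',
             'bg5', 'bg3' (raw signals and their backgrounds).
    Returns (scale, kept), where kept holds the scaled spots whose raw
    signal is at least min_fold times background in at least one channel.
    """
    mean_cy5 = sum(s["cy5"] for s in spots) / len(spots)
    mean_cy3 = sum(s["cy3"] for s in spots) / len(spots)
    scale = mean_cy5 / mean_cy3  # constant that equalizes the channel means

    kept = []
    for s in spots:
        weak5 = s["cy5"] < min_fold * s["bg5"]
        weak3 = s["cy3"] < min_fold * s["bg3"]
        if weak5 and weak3:  # eliminate only when BOTH channels are weak
            continue
        kept.append({**s, "cy3": s["cy3"] * scale})
    return scale, kept
```

As noted in the text, a single global constant is only appropriate when most genes are not expected to change. For a comparison such as a global repressor deletion, the two means would instead be computed over spike-in controls or a chosen subset of genes.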


Abstract 

Microarray experiments offer a potential wealth of information but also present a 
significant data analysis challenge. A typical microarray data analysis project 
involves many interconnected manipulations of the raw experimental values, 
and each stage of the analysis challenges the experimenter to make decisions 
regarding the proper selection and usage of a variety of statistical techniques. 
In this chapter, we will provide an overview of each of the major stages of a 
typical yeast microarray project. We will focus on providing a solid conceptual 
foundation to help the reader better understand each of these steps, will 
highlight useful software tools, and will suggest best practices where applicable. 




1. Introduction 

1.1. Overview 

Figure 2.1 illustrates a typical data analysis scheme for a microarray experi- 
ment. The first challenge in a microarray project is to develop a clear 
definition of the biological question of interest, as all subsequent tasks will 
be guided by the goals of the study. Given the material and time costs 
associated with microarray experiments, it is well worth designing a data 
analysis scheme in tandem with your experimental approach, to ensure that 
the data you produce will be sufficient to rigorously address your question 
of interest. Once an overall goal for the project is set, the key experimental 
design choices include the array platform, probe construction and sequence, 
and experimental protocol. As these aspects of a microarray experiment are 
covered elsewhere in this volume (see Chapter 1 by Andrew Capaldi and 
Chapter 3 by Maki Inada and Jeff Pleiss), below we will focus our discussion 
on the experimental design decision which has the greatest impact on data 
analysis: single-sample versus competitive hybridization. Once an experiment 
has been performed, data analysis begins with the acquisition of digital images 
describing the signal intensity associated with each "spot" or probe on an 
array. Image analysis results in a table linking each probe on the array with a 
quantitative response value. In the preprocessing stage, these raw response 
values are subjected to quality control and are normalized to allow for a fair 
comparison of measurements between probes on a single array and among 
different arrays. Once a high-quality set of comparable values has been 
obtained which describes the behavior of each unique target in each 
experimental sample, results can be visualized with a variety of clustering tools 
or subjected to statistically rigorous methods for identifying targets that 
are affected by each experimental treatment. Finally, after we have identified 
an interesting set of mRNAs, genes, or other targets, there are a number 
of ways to explore biologically meaningful associations among members of 
the set. 

Figure 2.1 The flow of information in a typical microarray data analysis project. 

In this chapter, we will walk through each of these major steps in the 
chronology of a microarray data analysis scheme. The goals of this chapter 
are to introduce common terminology, provide a conceptual basis for 
understanding the major design concerns, and suggest useful software and 
statistical tools where applicable. 

1.2. Commonly used terms in microarray data analysis 

Microarray platforms and experimental designs have grown increasingly 
varied as new uses for microarray technologies have emerged. The termi- 
nology used to discuss microarray data analysis procedures is often confusing 
to newcomers because the vocabulary must be sufficiently abstract to 
describe a wide range of technologies and assays. Here are some useful 
working definitions: 

Microarray         A microarray can be broadly defined as a high-density grid 
                   of probes attached to a two-dimensional surface. 

Platform           Platform specifies a particular array technology. This 
                   includes the choice of surface material (glass, silicon), 
                   probe material (PCR product, oligonucleotide), probe 
                   targets (ORFs, SNPs, tiled genome sequence), and 
                   manufacturing process (robotically printed, 
                   photolithography, ink-jet). 

Probe              A nucleic acid bait sequence, usually a DNA 
                   oligonucleotide or PCR product, designed to hybridize 
                   to a specific target in samples loaded onto the surface of 
                   an array. Arrays may include more than one probe 
                   sequence for each target. 

Spot, feature      A location on an array where a specific probe has been 
                   printed or synthesized. Many array designs contain 
                   replicate spots, meaning that a given probe is 
                   represented by more than one spot on the array. 

Biological sample  The starting material from a biological experiment, such 
                   as total RNA or DNA. In some designs this is directly 
                   labeled and loaded onto the microarray; in others it is 
                   first converted (e.g., from RNA to cDNA). 

Target             A molecular species in the sample for which there is a 
                   specific probe on the array. 

Label              Labels are attached to targets to allow for the detection of 
                   material hybridized to each spot on the array. Usually 
                   labels are fluorescent dyes, such as Cy3 and Cy5, which 
                   are excited by a laser and detected using dye-specific 
                   filter sets. 



1.3. A simple case study 

Consider a simple yeast experiment which we will use to illustrate different 
aspects of a microarray data analysis project throughout this chapter. Sup- 
pose we are interested in performing a common class of microarray experi- 
ment: identification of genes which are differentially expressed when yeast 
cells are exposed to an environmental stress. Our biological samples in this 
example might be total RNA isolated from cells before the treatment and 
at various time points following the treatment. Our material preparation 
protocol could involve a reverse transcription reaction in which we synthe- 
size a cDNA copy of isolated RNA molecules, followed by a labeling 
reaction in which we attach a fluorescent dye to our cDNA targets. Our 
platform would be a yeast ORF array, containing DNA oligonucleotide 
probes which are designed to hybridize to sequences specific to each gene in 
the yeast genome. 

Our goal in designing this microarray experiment and analyzing 
our results will be to identify genes which are differentially expressed in 
the pre- and posttreatment samples. 




2. Experimental Design: Single-Sample Versus 
Competitive Hybridization 

From a data analysis perspective, one of the most important decisions 
you will need to make when planning your microarray experiment is how 
to set up your sample hybridizations. In single-sample, or "one-channel," 
experiments each biological sample being assayed is individually hybridized 
on an array. In a competitive hybridization, or "two-channel" design, two 
or more biological samples with unique dye-labels are mixed and hybridized 
together on an array. Typically, your choice of array platform will be tied to 
your preferred hybridization strategy. 



24 Gregg B. Whitworth 

2.1. Single-sample hybridization 

A single-sample design can be appealing because it is both conceptually 
simple and analogous to other common DNA or RNA detection proce- 
dures (e.g., northern blot analysis). In these experiments, target molecules 
from each biological sample are labeled with fluorescent dye and hybridized 
to the surface of a single array. The intensity of the label associated with each 
probe is quantified using digital image analysis and these intensity values are 
used in all subsequent manipulations to represent the quantity of each target 
in the starting sample. In order for label intensities to be comparable 
between probes, the labeling procedure must operate at a similar efficiency 
on each target probed by the array. This means that labeling must be 
independent of the sequence length or base composition of the target. 
To meet this requirement, end-labeling schemes are commonly used, 
whereby a single fluorescent molecule is covalently linked to targets. The 
major advantage of single-sample designs is that there is a direct, and under 
optimal conditions potentially linear, relationship between the concentra- 
tion of each target in the original biological sample and the label intensity 
measured on the associated probe. 

If we were to analyze the biological samples in our simple case study 
above using a single-channel scheme we would hybridize end-labeled 
cDNA copies of each of our original biological RNA samples onto indivi- 
dual yeast ORF arrays. The absolute dye intensity imaged on each spot 
would be used to represent the quantitative expression level of the asso- 
ciated gene target. We could then compare expression levels between 
samples mathematically. For example, we might ask what the fold-change 
in expression levels of each transcript is between a pre- and posttreatment 
sample by comparing probe intensities observed on two different arrays. 
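
The between-array comparison described above amounts to a per-gene ratio of intensities. The following is a minimal Python sketch, assuming the intensities from the two arrays have already been normalized to a common scale; the gene labels and intensity values are made up for illustration only.

```python
import math


def log2_fold_changes(pre, post):
    """Return log2(post / pre) for every gene measured on both arrays.

    pre, post -- dicts mapping a gene label to its (already normalized)
                 intensity on the pretreatment and posttreatment arrays.
    """
    shared = sorted(set(pre) & set(post))
    return {gene: math.log2(post[gene] / pre[gene]) for gene in shared}


# Made-up intensities for three hypothetical genes.
pre = {"gene_A": 200.0, "gene_B": 8000.0, "gene_C": 6000.0}
post = {"gene_A": 3200.0, "gene_B": 8000.0, "gene_C": 1500.0}
changes = log2_fold_changes(pre, post)
```

Working on a log2 scale is a common convention rather than a requirement: it places induction and repression symmetrically around zero, so that a 4-fold decrease is reported as -2 and a 4-fold increase as +2.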



2.2. Competitive hybridization 

While single-sample designs attempt to measure the absolute levels of probed 
DNA or RNA species, competitive hybridization designs measure the 
relative concentration of each probed species in two or more samples. In 
standard competitive hybridization designs, two biological samples, usually a 
"reference" or control sample and an "experiment" or treatment sample, are 
labeled with two different dyes. The samples are then mixed and hybridized 
on the surface of an array. Two digital images are acquired for each array 
using a wavelength filter specific to the emission spectrum of each of the two 
dyes. During image analysis, the ratio of the two dye intensities is calculated 
for each spot. This ratio value is then used in all subsequent analysis steps 
to represent the relative level of targets in the two starting samples. 

Competitive hybridization designs can be more difficult for those new to 
microarrays to understand, because each measurement is inherently relative, 
rather than absolute. However, competitive hybridization also offers a number 
of significant advantages over single-sample designs. Competitive hybridiza- 
tion schemes tend to be insensitive to differences in the relative efficiency with 
which different target sequences pass through the sample preparation protocol 
and are detected on the array. In a typical gene expression experiment, 
for example, we can imagine that the efficiency of RNA isolation, cDNA 
synthesis, dye labeling, and probe hybridization could each be affected by the 
length and base composition of each target. Because competitive hybridization 
arrays measure the relative levels of the same target sequence from two different 
starting biological samples on each probe, differences in the performance 
characteristics of individual target sequences are not as relevant to the measure- 
ments being made. 

One disadvantage to competitive hybridization designs is that, with two 
samples used per array, each array consumes twice as much sample material. A second 
consideration to be aware of is that no two dyes will perform identically in 
an array experiment. In two-color experiments using Cy3 and Cy5, for 
example, there is often a "green" or Cy3-shift among low-intensity spots. 
For this reason, it is important to flip the association of dyes and samples in 
replicate array experiments ("dye-flipped" replicates), to ensure that obser- 
vations are not the result of a dye-intensity bias. In Section 4, we will 
also discuss computational methods which allow us to assess and mitigate 
dye-intensity bias. 
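
The logic of dye-flipped replicates can be made explicit with a small calculation. The Python sketch below assumes that, for a given spot, the dye bias adds the same amount to the log2(Cy5/Cy3) ratio on both arrays of the pair; this additive model is a simplifying assumption for illustration, and the function name is hypothetical.

```python
def combine_dye_flip(m_forward, m_flipped):
    """Combine log2(Cy5/Cy3) ratios from a dye-flipped pair of arrays.

    m_forward -- log2(Cy5/Cy3) with the experiment sample labeled with Cy5
    m_flipped -- log2(Cy5/Cy3) with the experiment sample labeled with Cy3
    Under the additive model:
        m_forward = effect + dye_bias
        m_flipped = -effect + dye_bias
    Returns (effect, dye_bias).
    """
    effect = (m_forward - m_flipped) / 2    # dye term cancels
    dye_bias = (m_forward + m_flipped) / 2  # biological term cancels
    return effect, dye_bias
```

For example, a spot reading 1.25 on the forward array and -0.75 on the flipped array yields an estimated effect of 1.0 and a dye bias of 0.25. A spot whose apparent change does not reverse sign on the flipped array is more likely to reflect a dye artifact than a biological difference.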

Competitive hybridization designs are not limited to two samples. 
Conceptually the only limit to the number of samples that can be put on 
a single array is the number of unique dye wavelengths that can be reliably 
differentiated. Several commercial scanners now support up to four dyes 
and several data analysis packages are already prepared to handle arbitrarily 
large numbers of dye-channels. 



2.3. Choosing the best approach 

Both of these hybridization strategies have been used in the microarray field 
to produce accurate and reproducible data. A good litmus test for which 
approach is best suited to your study may be to consider whether or not 
your question of interest is comparative. In the case study introduced above, 
our biological question is indeed comparative: we are interested in measur- 
ing changes in the relative abundance of transcripts in an experimental and 
control sample. We have a clear reference, the pretreatment sample, to 
which we want to compare transcript abundance in posttreatment samples. 
In this example, the competitive hybridization approach allows us to perform 
this comparison directly, rather than as a mathematical manipulation, 
thereby simplifying our data analysis approach and avoiding the potential 
propagation of error. 

Note: This is not to suggest that probe intensity is not a consideration in competitive hybridization experiments. 
As discussed in Section 2.2, for example, it is important to be aware of dye-specific intensity biases. However, with 
proper normalization, competitive hybridization designs can be extremely robust across a very wide range of 
probe intensities. 

The advantage of competitive hybridization designs for comparative 
studies becomes even stronger when we consider investigations involving 
more than one experimental factor. Suppose we are interested in testing 
both a wild-type (WT) and mutant strain in our example stress experiment. 
In a first-pass experiment, we might hybridize a prestress sample against a 
poststress sample for each of these two strains, on two different arrays. How 
would we interpret our results if we observed subtle differences between the 
two strains, which could plausibly be either biologically meaningful or 
simply due to random variation? In a competitive hybridization design we 
can add a third array into the mix which makes this comparison directly, 
hybridizing poststress samples from the WT and mutant strains on the same 
array. Because at least one of two original biological samples has been 
hybridized onto each of these three arrays we have a powerful tool for 
ensuring that the observations made on each array are directly comparable. 
Linear analysis packages such as Limma, discussed in Section 6, allow us to 
use competitive hybridization designs such as this to rigorously assess the 
statistical significance of apparent changes in expression level of a given 
target between two samples. 




3. Image Analysis 

The computational phase of a microarray experiment begins with the 
analysis of digital images. Image analysis usually encompasses the following 
steps: 

(1) Identify regions in the image which represent spots where probes have 
been printed. 

(2) Calculate the average intensity of pixels within each spot (foreground 
pixels). 

(3) Calculate the average intensity of pixels which lie outside of spots 
(background pixels). 

(4) Associate spots with platform annotations (probe identifiers). 

In this section, we will consider the information held in image files and 
how it is used to describe the quantity of target material hybridized to 
probes on an array. 
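
Steps (2) and (3) above can be sketched in a few lines of Python. This is a simplified illustration rather than the algorithm used by any particular package: it assumes a circular spot of known center and radius inside a small rectangular window that has already been cut out of the image, and it treats every pixel in the window outside the circle as background.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def spot_intensities(window, center_row, center_col, radius):
    """Mean foreground and background intensity for one circular spot.

    window -- list of rows, each a list of pixel intensities
    Pixels within `radius` of the center are foreground; all other
    pixels in the window are treated as local background.
    """
    foreground, background = [], []
    for r, row in enumerate(window):
        for c, value in enumerate(row):
            inside = (r - center_row) ** 2 + (c - center_col) ** 2 <= radius ** 2
            (foreground if inside else background).append(value)
    return mean(foreground), mean(background)
```

Real image analysis packages differ mainly in how they find the spot boundary and in which surrounding pixels they accept as background, which is one reason it is useful to know the details of the implementation being used.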

Note: In performing the WT and mutant stress experiment described in Section 2.3, we would actually use at least six arrays to obtain data from dye-flipped 
replicates of each experimental contrast. We would probably also perform a fourth type of array, comparing 
prestress WT and mutant samples to isolate the effect of the mutation alone. 

3.1. What is a digital image? 

To fully understand the processing of microarray images, it is important to 
understand how information is stored in a digital image. The simplest digital 
image formats store a table of intensity values corresponding to each pixel 
position in the image. The resolution of an image is equal to the dimensions 
of this table, usually expressed as the "number of columns × the number of 
rows," for example, "1024 × 1024." Microarray images are gray-scale, 
meaning that each pixel has a single intensity value associated with it. The 
range of intensity values for each pixel depends on the bit depth of the 
image. Typically, microarray images are 16-bit, meaning that pixel intensities 
can range in value from 0 (black) to 65,535 (white). Do not be 
alarmed if you open a microarray image in your favorite image viewer 
and it appears to be blank. Microarray images often fail to render properly in 
photo applications because these software packages are usually designed to 
handle only 8-bit images. 
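
The relationship between bit depth and intensity range, and one way to produce a viewable 8-bit rendering, can be sketched as follows. The function names are hypothetical, and the linear stretch shown is only one of many possible display mappings; it is meant for viewing and never for quantification.

```python
def intensity_range(bit_depth):
    """Return the (minimum, maximum) pixel value for a given bit depth."""
    return 0, 2 ** bit_depth - 1


def to_8bit_for_display(pixels):
    """Linearly rescale intensities to 0-255 for viewing only.

    The stretch maps the dimmest pixel to 0 and the brightest to 255.
    The original 16-bit values should still be used for all analysis.
    """
    low, high = min(pixels), max(pixels)
    span = (high - low) or 1  # avoid division by zero on a flat image
    return [round(255 * (p - low) / span) for p in pixels]
```

Here `intensity_range(16)` returns `(0, 65535)` and `intensity_range(8)` returns `(0, 255)`, which shows how much dynamic range is lost if an image is permanently converted to 8-bit.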

When working with microarrays, it is important to be aware of the 
file format being used to store array images. Microarray images should 
always be kept in lossless formats. Lossy compressed formats, such as 
JPEG, discard pixel information during compression in the interest of 
decreasing overall file size. Although it might be tempting to choose such a 
format for long-term storage, because microarray image files 
can be quite large, storing images in a lossy format is the digital equivalent of 
throwing away data. 
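
The difference between lossless and lossy storage can be demonstrated directly. The Python sketch below packs a made-up run of 16-bit pixel values, compresses it with zlib (DEFLATE, the lossless method also used inside PNG files), and confirms that every value survives the round trip. The "lossy" step is a deliberately crude stand-in that discards the low 8 bits of each pixel; it is not how JPEG works, but it shows what irreversible loss of pixel information looks like.

```python
import struct
import zlib


def pack_16bit(pixels):
    """Pack integer intensities as little-endian unsigned 16-bit values."""
    return struct.pack(f"<{len(pixels)}H", *pixels)


def unpack_16bit(raw):
    """Inverse of pack_16bit."""
    return list(struct.unpack(f"<{len(raw) // 2}H", raw))


# Made-up scan line: mostly low background plus a few bright pixels.
pixels = [45, 47, 44, 46] * 500 + [65535, 30210, 812]

raw = pack_16bit(pixels)
compressed = zlib.compress(raw)
restored = unpack_16bit(zlib.decompress(compressed))

# Crude stand-in for a lossy step: keep only the top 8 bits of each pixel.
lossy = [(p >> 8) << 8 for p in pixels]
```

After the lossless round trip, `restored` is identical to `pixels`, whereas in `lossy` every background pixel has collapsed to zero and cannot be recovered.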

The tagged image file format (TIFF) is commonly used to store 
microarray images. By default these files store data uncompressed. Each 
TIFF can hold an arbitrary number of images, called layers. This feature is 
used in competitive hybridization experiments to save the images from both 
dye-channels in a single file. As the name suggests, TIFFs can also hold a set 
of user-defined tags. Microarray scanner software will often save informa- 
tion about the image acquisition session to these files such as the scanner 
temperature and laser settings. Other common file formats which offer 
lossless compression include PNG (portable network graphics) and GIF 
(graphics interchange format), although GIF is limited to 256 colors or gray 
levels per image and so cannot hold 16-bit data. 

Note: Each bit in the file has one of two values (binary), so 16 bits per pixel gives 2^16 = 65,536 possible values. 



3.2. Data files 

Input. Image analysis consumes two types of data: the digital images 
themselves and platform-specific probe annotations. Probe annotations are 
usually saved in text files which associate spot addresses on the array with 
information about the probes printed on each spot. Examples include 
GenePix GAL files and Affymetrix CDF files. These files allow us to link 
each observation in the array data to: 

(1) a spot location, or set of locations, on the surface of the array (informa- 
tion that can be used in quality assessment) 

(2) a probe sequence (necessary for MIAME compliance). 

Output. Image analysis produces a table of information describing various 
properties of each spot on the array. Usually these files have as many rows as 
there are spots on the array. Example formats include GenePix GPR files, 
Affymetrix CEL files, and SPOT files. Average intensities of the pixels 
associated with each spot are extracted from these files in the preprocessing 
step. Information stored in these files often describes the shape of each 
spot and variation in pixel intensities, both of which can be used to flag 
low-quality spots during spot filtering. 

3.3. Software tools 

3.3.1. Commercial packages 

Most commercial microarray scanners are sold with accompanying licenses 
for image analysis software. For example, GenePix is licensed with Axon 
scanners and offers an integrated image analysis solution, including control 
of the scanner to acquire images, import of probe information, spot finding, 
and pixel intensity analysis. One advantage of commercial image analysis 
packages is that they tend to offer integrated, user-friendly interfaces. 
Unfortunately, it can be quite costly to obtain a license for these packages 
in the absence of a hardware purchase. Also, as with any closed-source 
software, it is usually not possible to obtain detailed information about the 
implementation of image analysis algorithms. 

3.3.2. Open-source and freely available packages 

Although none of the free solutions have reached the maturity of most 
commercial packages, there are several projects worthy of note: 

ScanAlyze (http://rana.lbl.gov/EisenSoftware.htm) 
    An open-source microarray image analysis package written and 
    maintained by Michael Eisen. (Windows only) 

MicroArray_Profile (http://www.optinav.com/imagej.html) 
    An ImageJ (http://rsbweb.nih.gov/ij/) plugin that can be used to analyze 
    microarray images. MicroArray_Profile allows you to define a spot 
    grid, automatically adjust spot diameters, and export labeled mean 
    pixel intensity values. (Cross-platform) 

Spot (http://www.cmis.csiro.au/iap/Spot/spotmanual.htm) 
    An R package that can be used to perform microarray image analysis, 
    originally written by Yang et al. (2001). Installation instructions and a 
    step-by-step user guide are both available on the web site. 
    (Cross-platform) 

See Section 8.2. 




4. Preprocessing 

Data preprocessing can encompass a number of different manipulations 
of the raw probe intensity values obtained from image analysis. 
In general, the goals of preprocessing are to ensure that: 

(1) Observations are comparable between different probes on a single 
array. 

(2) Observations are comparable between arrays. 

(3) Low-quality observations are removed from the dataset. 

Preprocessing procedures are dependent on the array platform and 
hybridization scheme. In the sections that follow we will focus our discus- 
sion on preprocessing data from competitive hybridization experiments. In 
general, a preprocessing workflow will consume data from image analysis 
files (GPR, CEL, SPOT) and ultimately produce a table which associates a 
single expression value with each target across each unique experimental 
condition. This table then serves as the starting point for higher order 
analysis such as hierarchical clustering or differential expression analysis. 
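
The end product of preprocessing, a table holding one value per target and condition, can be sketched as follows. This Python example assumes that spot-level log2 ratios and quality flags have already been extracted from the image analysis files; the tuple layout, the function name, and the use of the median to combine replicate spots are illustrative assumptions rather than a prescribed method.

```python
from statistics import median


def summarize_targets(spot_rows):
    """Collapse spot-level log2 ratios into one value per target and sample.

    spot_rows -- iterable of (target, sample, log2_ratio, flagged) tuples,
                 a hypothetical simplification of an image analysis file.
    Returns {target: {sample: median log2 ratio of the unflagged spots}}.
    """
    grouped = {}
    for target, sample, log_ratio, flagged in spot_rows:
        if flagged:  # low-quality observations are removed first
            continue
        grouped.setdefault(target, {}).setdefault(sample, []).append(log_ratio)
    return {
        target: {sample: median(values) for sample, values in by_sample.items()}
        for target, by_sample in grouped.items()
    }
```

A target whose spots are all flagged simply does not appear in the output, which makes missing values explicit for downstream clustering or differential expression tools.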



4.1. Software tools 

The preprocessing procedure is often the most computationally intensive 
and data rich stage of a microarray analysis scheme, usually encompassing 
several transformations of the primary data. Because of this, it is important to 
establish a set of best practices for your lab which ensure that the preprocessing 
of data is both consistent between arrays and well documented. 

Here we review several useful software tools which can be used to 
perform preprocessing steps. Section 8 at the end of this chapter addresses 
data persistence tools which can be used to store and replicate the input to, and 
output from, your preprocessing procedure. 

4.1.1. Spreadsheets 

Common spreadsheet applications include Microsoft Excel, OpenOffice 
Calc, and Google Docs Spreadsheets. Spreadsheet applications are appealing 
working environments because they are familiar to most researchers and 
microarray data structures fit neatly into two-dimensional tables. Spreadsheet 
applications usually feature some basic declarative programming facilities, 
for example, allowing the user to calculate values using predefined 
formulas based on the data held within a range of cells in a table. There are, 
however, several common pitfalls of spreadsheet software which should be 
considered before committing to using spreadsheets for the bulk of a data 
analysis scheme. First, many spreadsheet applications (with the exception of 
Google Docs) allow the user to perform sorting and filtering operations 
on arbitrary subsets of rows or columns based on the current user selection. 
In a single "click" this can lead to disastrous consequences, such as jumbling 
ratio values and probe labels. Second, spreadsheet packages usually do not 
build a long-term record of changes made to the data by the user (again, 
with the exception of Google Docs), placing the onus of manually recording 
each and every data manipulation on the researcher. Finally, many spread- 
sheet applications impose hard limitations on the number of columns and 
rows present in each table. Before committing to a particular software 
package it is important to verify that the software will support the dimen- 
sions of your array platform and the number of arrays you intend to perform. 
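
The sorting pitfall described above is easy to reproduce, and to avoid, in a scripted environment. The short Python sketch below uses made-up probe labels and ratio values: sorting one column on its own silently breaks the pairing between labels and values, whereas sorting whole records keeps each label attached to its measurement.

```python
probe_ids = ["P1", "P2", "P3"]  # hypothetical probe labels
ratios = [2.0, 0.5, 1.0]        # made-up ratio values, same order as labels

# Pitfall: sorting one column alone breaks the pairing with the labels.
jumbled = list(zip(probe_ids, sorted(ratios)))

# Safer: bind each label to its value first, then sort whole records.
records = sorted(zip(probe_ids, ratios), key=lambda rec: rec[1])
```

In `jumbled`, probe P1 is now paired with 0.5 instead of 2.0 and nothing signals the error; in `records`, P1 is still paired with 2.0 after sorting.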

4.1.2. Commercial statistics packages 

Commercial statistical packages such as SPSS or Minitab offer a step-up 
from basic spreadsheet applications. These software packages are usually 
designed to efficiently handle large datasets and offer more robust facilities 
for describing the structure of a tabular dataset than most spreadsheet 
software. These packages also implement a broader range of statistical 
algorithms and usually support more sophisticated programming capabilities. 
Disadvantages of these software packages include the expense of licenses 
and, because they are closed-source, a lack of access to details about the 
implementation of the statistical methods they feature. 

4.1.3. R and Bioconductor 

In the domain of microarray data analysis one open-source statistical
environment is worthy of special note: R (http://www.r-project.org/). R is a
freely available, open-source implementation of the S language (the basis of
the commercial S-Plus system) and has attracted the attention of a wide range
of users in both the academic and private sectors. There are several features
which make R appealing as a data analysis environment. First, it is entirely
open-source. This means both that the software is free to obtain and that the
implementation details of all statistical methods found in R are freely
available and open to scrutiny. Second, R has emerged as one of the most
accommodating environments for statistical research, leading to the
establishment of an active and innovative community. Finally, R implements a
full-featured programming language which can be used to abstract and
automate sophisticated analysis procedures.

Because of these features, R was chosen as the host environment for the
Bioconductor suite of microarray data analysis software (Gentleman et al.,
2004). Bioconductor boasts an impressive array of microarray analysis tools
which support a wide variety of platforms and address both preprocessing 
and higher order analysis. Although R and Bioconductor offer an ideal set 
of microarray data analysis tools, the initial learning curve can be quite steep, 
especially for those unfamiliar with command-line environments or script- 
ing languages. However, if your microarray project extends beyond a 
small number of arrays, spending the time to learn how to use R and 
Bioconductor at the start of a project can save you many hours of work 
down the road. In contrast to working in a spreadsheet application, once 
you have designed a data analysis procedure in R you will never have to 
manually perform it again: any set of commands can be saved in a script file 
and run on new input data. 

To get started with R, one-click installers are available for Windows,
Mac OS X, and Linux platforms and can be found in the "download"
section of the R-project.org web site. When you launch R on your system
you will be presented with an "interactive interpreter" window in which
you can enter commands. R comes packaged with a number of documents
in PDF format aimed at the new user. The best place to start is with the
introductory "R-intro.pdf." On Windows, users can find this document
from within the "Rgui" application (found in the Start menu after installa- 
tion) by opening the "Help" menu, selecting "Manuals (in PDF)," and 
then "An Introduction to R." The exercises in this document should take 
new users a few hours to work through and will introduce you to all of the 
key features and concepts needed to implement data analysis schemes in R. 

Installing Bioconductor (http://www.bioconductor.org/) from within the R
environment is easy. To perform a standard installation, enter the following
commands at the R prompt (denoted below as a ">"):

> source("http://bioconductor.org/biocLite.R")

> biocLite()

Additional installation options and instructions for installing nonstandard
packages are available on the Bioconductor web site. Once installed, you
can load a specific Bioconductor package with the library() function. To
load the "marray" suite of preprocessing functions, for example, enter:

> library(marray)

The R help() function can be used to obtain information about the use 
of methods implemented in Bioconductor packages. For example, the 
following command will display information about the usage of the main 
array normalization function ("maNorm") in the "marray" package:

> help(maNorm)

Many Bioconductor packages also come packaged with tutorial-style
documents in PDF format called "vignettes." To access the vignettes from
within R:

> library(Biobase)

> openVignette()

You can then enter the number corresponding to the vignette of interest 
and the associated PDF should open on your system. For microarray 
normalization procedures the "marray" and "Limma" vignettes are great 
places to get started. 



4.2. Calculating ratio values 

In a competitive hybridization experiment, ratios are calculated for each 
spot from the average pixel intensities in each channel. There are several 
methods which can be used to calculate the average pixel intensity for a 
spot. The first option to consider is which descriptive statistic will be used, 
usually the mean or median. On a high-quality spot, the values of the mean 
and median pixel intensities should be very similar. One advantage to
choosing the median pixel intensity over the mean is that it will be more
robust to small numbers of outliers, which can help to mitigate the effects of
small scratches (aberrantly low-intensity pixels) or dust (aberrantly
high-intensity pixels). The second consideration is whether the channel ratios
are calculated before or after averaging. For example, one can first average
the intensity of pixels in the red and green channels independently and then
calculate the ratio of these two averages ("ratio of the means" or "ratio of
the medians"). Alternatively, one can calculate the ratio of the intensity of
the red and green channel for each pixel and then average the set of ratios
("mean of ratios" or "median of ratios"). Again, for a high-quality spot, we
would expect these four values to be similar.

Bioconductor installation instructions: http://www.bioconductor.org/docs/install/

By convention, we usually assign the "red" (Cy5; 635 nm filter) channel 
to the experimental treatment and "green" (Cy3; 536 nm filter) channel to 
the reference sample, so that this ratio is greater than 1 when a gene target 
increases in abundance in response to a treatment, and the ratio is less than 1 
when a target decreases in abundance. It is important to remember to "flip" 
values, or calculate the inverse of the ratios obtained from image analysis, 
for arrays in which the experimental and reference channels are arranged in 
the reverse orientation. 

Finally, it is also common practice to transform ratio values to a
logarithmic scale. A log2 transformation makes ratios particularly convenient
to work with because it is simple to conceptualize the fold-change in a ratio
given a log2 value. For example: log2(2/1) = 1, log2(1/1) = 0, and
log2(1/2) = -1.
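As a brief illustration of the calculations described in this section, the following R sketch computes the alternative spot ratios, a log2 transformation, and a dye-swap "flip." The pixel intensities and all variable names are invented for the example; in practice these values would come from your image analysis software.

```r
# Hypothetical pixel intensities for a single spot (illustrative values only).
red   <- c(1200, 1150, 1300, 1250, 9000)  # Cy5; the last pixel mimics a dust speck
green <- c( 600,  580,  640,  630,  610)  # Cy3

ratio_of_means   <- mean(red)   / mean(green)    # pulled upward by the outlier
ratio_of_medians <- median(red) / median(green)  # robust to the single outlier
median_of_ratios <- median(red / green)          # per-pixel ratios, then median

# A log2 transformation makes fold changes symmetric around 0.
log2(c(2/1, 1/1, 1/2))   # 1 0 -1

# "Flipping" a dye-swapped array: inverting a ratio negates its log2 value.
log_ratio         <- log2(ratio_of_medians)
log_ratio_flipped <- log2(1 / ratio_of_medians)
```

On this made-up spot the median-based ratios stay close to 2, while the ratio of the means is inflated by the one bright pixel.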

4.3. Normalizing ratio values 

In a competitive hybridization experiment, the total signal intensities of the 
red and green channels will never be perfectly balanced. As described in 
Chapter 3 in this volume, consideration is given to signal balance in both 
the sample preparation and image acquisition stages. However, a final 
mathematical manipulation of the data is required to account for any 
remaining array-specific biases in signal intensity in order for ratio values 
to be fairly compared between different arrays. Many normalization strate- 
gies also include adjustments for technical bias to improve comparison 
between probes on the same array. 

It is important to carefully consider both the structure of your data and
the underlying biological question of interest when choosing a normalization
scheme. As a best practice, make plots which allow you to assess the overall
distribution of ratio values on each array
before and after normalization. Histograms or density plots such as those
illustrated in Fig. 2.2 are easy to make in spreadsheet software and statistical 
packages. Boxplots are useful when you want to compare the distribution of 
ratio values across a large number of arrays on a single axis (Fig. 2.3). Finally, 
scatter plots comparing ratio values to spot pixel intensity (in Bioconductor, 
"MA" plots) are useful for assessing the degree to which spot ratios have 
been influenced by a dye-intensity bias. 

4.3.1. Global mean or median normalization 

The simplest form of ratio normalization is a global mean or median
adjustment. As illustrated in Fig. 2.2, in this normalization procedure each
ratio is adjusted by a constant value to center the mean or median of the
distribution of ratios observed on the array. In this example, the ratios on
our array showed a clear "green" bias (log2(ratio) < 0), which we can
account for by adding a constant value to the log2-transformed ratios (or
by multiplying by a constant value if we are working with raw ratios).

Figure 2.2 A hypothetical distribution of log2(ratio) values before and after global
median normalization. Before normalization this is a "green" array, where the average
ratio of red/green is < 1. After normalization we have shifted the distribution to the
right, moving the median log2(ratio) value to 0 and preserving the shape of the
distribution.

Figure 2.3 Boxplots showing example distributions of log2(ratio) across a large
number of arrays. Log2(ratio) values are shown on the y-axis and data from each array
is plotted in a single boxplot along the x-axis. Distributions are shown before (A) and
after (B) normalization. Notice that we can easily identify two low-quality arrays as
outliers on the left-hand side of these graphs.

A global adjustment is appropriate if we can assume that the total amount 
of input material from the experimental and reference samples should be 
equivalent. In most cases a median adjustment is preferable to a mean 
adjustment, because it will be insensitive to outliers. This adjustment is 
easy to calculate in a spreadsheet or statistical package. 
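A global median adjustment can be sketched in a few lines of base R. The data below are simulated to mimic a "green" array, and the variable names are our own; this is an illustration of the idea rather than the procedure used by any particular package.

```r
set.seed(1)
# Simulated log2(ratio) values for one array with a "green" bias (assumed data).
log_ratio <- rnorm(1000, mean = -0.4, sd = 0.5)
log_ratio[c(5, 17)] <- NA   # stand-ins for flagged spots with no usable ratio

# Subtracting the median is the same as adding the constant -median.
array_median <- median(log_ratio, na.rm = TRUE)
normalized   <- log_ratio - array_median

# The distribution is now centered at (approximately) 0; its shape is unchanged.
median(normalized, na.rm = TRUE)
```

Because every value is shifted by the same constant, the spread of the distribution is preserved; only its center moves.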

4.3.2. Normalizing to spike-in controls or housekeeping genes

In cases where we expect a large proportion of targets to show a biologically 
relevant change in expression levels between the experimental and refer- 
ence samples, a global median normalization may not be appropriate. The 
alternative is to normalize features on the array based on the behavior of a
small subset of spots which we do not expect to show expression level changes
between the two samples. One option is to choose probes which target
"housekeeping" genes which we can reasonably assume will not be affected
by the treatment. A second strategy is to design probes which do not target
any of the DNA/RNA species in the original biological sample, but rather
hybridize to a "spike-in" control which can be added to the samples at
a standard concentration. One disadvantage to both of these schemes is that
they rely heavily on the behavior of a relatively small number of spots, and
the consistency of the underlying concentration of material probed by
those spots.
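The control-based adjustment differs from a global median adjustment only in which spots are used to estimate the offset. The sketch below assumes a table with one row per spot and a logical column marking the control spots; the identifiers, values, and column names are hypothetical.

```r
# Assumed layout: one row per spot, with control spots marked in 'is_control'.
spots <- data.frame(
  id         = c("YFG1", "YFG2", "YFG3", "SPIKE1", "SPIKE2", "SPIKE3"),
  log_ratio  = c(1.9, -1.2, 2.4, 0.35, 0.25, 0.30),
  is_control = c(FALSE, FALSE, FALSE, TRUE, TRUE, TRUE)
)

# Estimate the offset from the control spots only.
offset <- median(spots$log_ratio[spots$is_control])

# Apply the same constant to every spot on the array.
spots$normalized <- spots$log_ratio - offset
```

With only three control spots, a single aberrant control would shift every ratio on the array, which is the weakness of this scheme noted above.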

4.3.3. Adjusting for intensity bias 

A global normalization, based either on the median ratio value or control 
spots, adjusts all ratios using a constant value, irrespective of the signal 
intensity observed on each spot. However, most competitive hybridization 
arrays exhibit a bias in ratio value across different signal intensities. As 
mentioned previously, one common cause for this bias can be variability 
in the behavior of different dyes at different levels of intensity. LOESS
(local polynomial regression fitting) is a regression-based approach which
can be used to adjust ratio values based on the observed relationship
between spot ratio and intensity (Cleveland et al., 1992; Yang et al.,
2002). An excellent application of this procedure to microarray data is
provided by the Bioconductor "marray" package.
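To make the idea concrete, the sketch below applies the general approach using the loess() function from the base R "stats" package rather than marray. It uses the conventional MA-plot quantities, M = log2(red) - log2(green) and A = the average of log2(red) and log2(green). The simulated bias, the span, and the variable names are assumptions chosen for illustration only.

```r
set.seed(2)
n <- 2000
A <- runif(n, min = 6, max = 14)            # average log2 intensity per spot
# Simulated intensity-dependent dye bias plus noise (assumed for illustration).
M <- 0.8 - 0.1 * A + rnorm(n, sd = 0.3)     # log2(red/green) per spot

# Fit a local regression of ratio (M) on intensity (A).
fit   <- loess(M ~ A, span = 0.4, degree = 1)
trend <- predict(fit, newdata = data.frame(A = A))

# Subtract the fitted trend so that ratios no longer depend on intensity.
M_norm <- M - trend
```

Plotting M against A before and after the adjustment (an "MA" plot) is a quick way to confirm that the trend has been removed.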

4.3.4. Adjusting for spatial or print-tip bias 

Another source of technical error in microarray data which normalization 
can be used to minimize is bias in ratios arising from the physical location of 
spots on the array. Depending on the array platform, location bias can be
caused by the manufacturing process (e.g., printing efficiencies of different
print-tips in robotically pinned arrays), the hybridization process, or
inconsistencies in scanner alignment. Although it is possible to manually
implement spatially aware normalization procedures using common spreadsheet
software, Bioconductor offers a number of well-developed and convenient
tools. The marray package, for example, can be used to apply LOESS
smoothing to ratios based on either printing block (for pinned arrays) or
the two-dimensional spatial bias in the local neighborhood of each spot (see
the documentation for the "maNorm" wrapper function). The Bioconductor
"arrayQuality" package also offers powerful visualization tools for assessing
spatial bias across the surface of an array.
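As a deliberately simplified stand-in for the print-tip LOESS procedure in marray, the following base R sketch centers the median log2(ratio) within each printing block. It corrects only a constant offset per block, not an intensity-dependent one, and the block assignments and bias values are simulated assumptions.

```r
set.seed(3)
block <- rep(1:4, each = 100)          # assumed print-tip block for each spot
bias  <- c(-0.3, 0.0, 0.2, 0.5)        # simulated per-block offsets
M     <- rnorm(400, sd = 0.4) + bias[block]

# Subtract each block's own median from the spots printed in that block.
M_norm <- M - ave(M, block, FUN = median)

# Block medians before and after the adjustment.
tapply(M, block, median)
tapply(M_norm, block, median)
```

The same pattern extends to a per-block LOESS fit by replacing the median with a regression on intensity fitted separately within each block.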

4.3.5. More aggressive approaches 

If your biological question of interest and the structure of your data conform 
to a stricter set of assumptions than standard expression arrays, there are a 
variety of more aggressive normalization techniques which can be used. 
For example, if you can assume that the variation in ratio distributions 
should be equivalent between arrays you may consider performing scale 
or quantile normalizations (Dudoit and Yang, 2002).
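For readers unfamiliar with quantile normalization, the following is a minimal base R sketch of the idea: every array is forced to share the same distribution of values, while the rank order of spots within each array is preserved. The function name quantile_normalize is our own, the data are simulated, and the sketch assumes a complete matrix with no missing values.

```r
# Rows are spots, columns are arrays; assumes no missing values.
quantile_normalize <- function(x) {
  sorted <- apply(x, 2, sort)                          # sort each array
  target <- rowMeans(sorted)                           # mean of each quantile
  ranks  <- apply(x, 2, rank, ties.method = "first")   # rank within each array
  out    <- apply(ranks, 2, function(r) target[r])     # map ranks to targets
  dimnames(out) <- dimnames(x)
  out
}

set.seed(4)
x <- cbind(array1 = rnorm(500, mean =  0.0, sd = 0.5),
           array2 = rnorm(500, mean =  0.3, sd = 1.0),
           array3 = rnorm(500, mean = -0.2, sd = 0.8))
x_qn <- quantile_normalize(x)

# After normalization the three arrays have identical quantiles.
apply(x_qn, 2, quantile)
```

Because this procedure removes all differences in the shape of the distributions, it is only appropriate when the assumption described above holds.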

4.4. Quality assessment, filtering, and handling replicates 

A data analysis procedure is only as strong as the quality of the data fed into 
it. It is, therefore, important to assess the quality of your array data and 
implement an effective filtering scheme. Quality assessments can be both 
qualitative and quantitative, and can be used to filter individual data points 
or entire arrays. 

4.4.1. Spot quality 

The first stage of spot filtering occurs during image analysis. Spots which 
cannot be found by the image analysis software should be flagged for down- 
stream filtering. Spots should also be flagged if they are smeared, continuous 
with another spot, significantly marred by dust, or scratched. Depending 
on your array platform, it may also be useful to automatically flag spots 
which are bigger or smaller than certain reasonable size cutoffs or for which 
there are extremely high variances in the pixel intensities in either dye
channel. Flags can be added to image analysis data tables using a calculated
column in a spreadsheet or with logical index vectors in R. Generally, flagged 
spots are excluded from ratio normalization calculations. 
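The logical index vectors mentioned above might be used as in the following sketch. The column names, the size and variance cutoffs, and the data are all hypothetical; appropriate cutoffs depend on your array platform and image analysis software.

```r
# Hypothetical image analysis output: one row per spot.
spots <- data.frame(
  id        = paste0("spot", 1:6),
  found     = c(TRUE, TRUE, FALSE, TRUE, TRUE, TRUE),
  diameter  = c(100, 95, NA, 30, 110, 105),
  pixel_sd  = c(120, 150, NA, 90, 4000, 130),
  log_ratio = c(0.2, -0.5, NA, 1.8, 2.5, 0.1)
)

# Each criterion is a logical vector with one element per spot.
not_found <- !spots$found
bad_size  <- !is.na(spots$diameter) &
             (spots$diameter < 50 | spots$diameter > 200)   # placeholder cutoffs
noisy     <- !is.na(spots$pixel_sd) & spots$pixel_sd > 1000 # placeholder cutoff

# A spot is flagged if it fails any criterion.
spots$flag <- not_found | bad_size | noisy

# Flagged spots are excluded when calculating the normalization constant.
usable <- spots$log_ratio[!spots$flag]
median(usable)
```

Keeping the flag as a column, rather than deleting rows, leaves a record of which spots were filtered and why.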

4.4.2. Array quality 

As a general rule of thumb, if an array "looks bad" then it probably is.
The Bioconductor arrayQuality package
(http://bioconductor.org/packages/2.4/bioc/html/arrayQuality.html) implements
a number of plotting techniques which can be used to identify spatial bias
across the surface of an array. For example, a heatmap of ratio-value ranks on
an array will readily reveal any spatial bias in spot ratios. If your platform
design includes replicate spots (identical probes printed in multiple
locations) which are distributed across different locations on the array, it is
possible to assess whether or not variation in ratio values can be explained by
surface position. Replicate spots also allow you to calculate an average
variance among replicate groups, a parameter which can provide a useful
measure of array quality.

4.4.3. Utilizing spot and array replicates 

There are a number of ways in which technical replication can be used to 
enhance the power of microarray-based investigations. The central question 
of how to best make use of technical replication comes down to when in the 
analysis procedure replicates should be averaged and how to utilize infor- 
mation about the variability underlying each unique measurement. Spot 
replicate averaging can be accomplished by taking the mean or median. 
Whenever you average values, it is also important to calculate a measure- 
ment of variation among the observations contributing to each average. 
Variance and standard deviation are used most commonly. As noted above, 
spot replicate and array replicate variation can be used as a quantitative 
measure of array quality. Information about the variation among replicates 
can also be used to "weight" observations in most clustering algorithms (see 
Section 5), allowing higher quality observations to have a stronger influence 
on the structure of the resulting graph. Finally, replicate variation is also 
used in differential expression analysis (see Section 6). When submitting 
datasets to public repositories (see Section 8.2), it is important to include 
preaveraged data, so that interested parties can independently recreate these 
data analysis steps. 
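Averaging replicates while retaining a measure of their variation can be sketched in base R as follows. The gene names, values, and column names are hypothetical, and the table layout (one row per replicate measurement) is an assumption.

```r
# Hypothetical replicate measurements: three spots for each of two genes.
reps <- data.frame(
  gene      = rep(c("YFG1", "YFG2"), each = 3),
  log_ratio = c(1.0, 1.2, 0.8, -0.5, 2.0, -0.4)
)

# Summarize each replicate group: average, variation, and number of observations.
avg <- tapply(reps$log_ratio, reps$gene, median)
sdv <- tapply(reps$log_ratio, reps$gene, sd)
n   <- tapply(reps$log_ratio, reps$gene, length)

summary_tab <- data.frame(gene   = names(avg),
                          median = as.vector(avg),
                          sd     = as.vector(sdv),
                          n      = as.vector(n))
summary_tab
```

In this made-up example the replicates for the second gene disagree strongly, so its average deserves less confidence (or a lower clustering weight) than that of the first gene.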

4.5. Preprocessing Affymetrix arrays 

Although the goals for preprocessing single-sample arrays are similar to 
those for competitive hybridization arrays, the preprocessing approaches 
differ significantly. Single-sample arrays, such as the Affymetrix GeneChip 
platform, often include a series of "Perfect match" (PM) and "Mismatch" 
(MM) probes for each gene target. Preprocessing of these arrays involves 
summarizing PM and MM probe set behaviors. There are a number of 
software packages available which implement different preprocessing 
algorithms for Affymetrix arrays. Two mature and popular options are: 

Expression Console
    Affymetrix offers their expression array analysis software free of
    charge for registered users on their web site
    (http://www.affymetrix.com/support/technical/software_downloads.affx).
    The latest generation of this analysis suite implements several
    preprocessing algorithms including MAS, PLIER, and RMA.

Bioconductor
    The Bioconductor project includes a number of packages which can be
    used to normalize data from Affymetrix arrays, including: "affy,"
    "gcrma," and "affyPLM" (Bolstad et al., 2003; Irizarry et al., 2003).

5. Visualizing Data Using Cluster Analysis

One of the most popular ways to explore microarray datasets is with 
clustering analysis. Microarray datasets may contain information about the 
behaviors of ~10k to 1M different probes across dozens or hundreds of
different experiments, resulting in a grid of data which is far too large to 
simply "browse." Clustering techniques are used to order the data according 
to the behavior of probed targets (genes), experimental factors (arrays), or 
both. Visualization tools are then used to explore the resulting data structure. 

5.1. Hierarchical clustering 

The most common clustering algorithm used in the microarray field is
hierarchical clustering. Hierarchical clustering is an unsupervised machine
learning method, meaning that it is used to order a dataset based on the data
alone rather than a predefined model.

5.1.1. An example use-case 

Let's consider our microarray case study from the introduction. After we 
have analyzed our array images, normalized and log2-transformed our ratio
values, and filtered out low-quality data we will be left with a table of 
information which describes the behavior of target genes in each of our 
biological sample hybridizations (Table 2.1). We can use hierarchical clus- 
tering of this table of values to help us identify genes which exhibit similar 
behaviors across this stress time course. In this case, we would only want 
to cluster the gene or target axis, because the experiments already have an 
obvious biologically meaningful order. 

5.1.2. How hierarchical clustering works 

Hierarchical clustering begins by examining a list of elements, in this
example a list of genes, to identify the two elements which are most similar.
The results of a hierarchical clustering run are highly dependent on the way
in which "similar" is defined. Pearson correlation is a straightforward
similarity metric to imagine in this context: the similarity between any
two genes can be calculated as the Pearson correlation of log2(ratio) values
observed across the arrays in the time course. The absolute value of the
Pearson correlation could be used if we wanted genes which showed large
changes at the same time point to score as similar, irrespective of whether
expression levels went up or down. Other measures, such as nonparametric
rank-order correlation or Euclidean distance, can also be used to evaluate
similarity. Choosing a different similarity metric can dramatically affect the
structure of a cluster, so it can be beneficial to spend some time exploring
your options. Once the two most similar genes have been identified in the
starting list, they are associated on a single branch of a tree. The original
data associated with these two genes is then removed from the list of elements
and replaced with a new entry that represents the "average" behavior of both.
The method by which the "average" behavior of a branch is represented in
subsequent similarity calculations (the linkage method) is the second major
parameter which controls the behavior of hierarchical clustering.

Table 2.1 Example data structure produced by preprocessing which can be used for
cluster analysis

         Array 1 (stress time    Array 2 (stress time
         point 1/control)        point 2/control)
Gene1    -0.1503                 -0.3861
Gene2     0.3857                  0.2168
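The two parameters discussed in this section, the similarity metric and the linkage method, map directly onto arguments of the hclust() function in the base R "stats" package. The sketch below clusters the gene axis of a small simulated table laid out like Table 2.1; the data, gene names, and the choice of average linkage are assumptions made for illustration.

```r
set.seed(5)
# Simulated log2(ratio) table: 6 genes (rows) x 5 time points (columns).
up   <- c(0, 0.5, 1.0, 1.5, 2.0)   # induced over the time course
down <- -up                        # repressed over the time course
x <- rbind(
  gene1 = up   + rnorm(5, sd = 0.1),
  gene2 = up   + rnorm(5, sd = 0.1),
  gene3 = up   + rnorm(5, sd = 0.1),
  gene4 = down + rnorm(5, sd = 0.1),
  gene5 = down + rnorm(5, sd = 0.1),
  gene6 = down + rnorm(5, sd = 0.1)
)

# Similarity metric: Pearson correlation between genes, converted to a distance.
d <- as.dist(1 - cor(t(x)))

# Linkage method: how the "average" behavior of a branch is represented.
hc <- hclust(d, method = "average")

# Cut the tree into two groups to inspect the result.
groups <- cutree(hc, k = 2)
groups
```

Replacing 1 - cor(t(x)) with 1 - abs(cor(t(x))) would instead group induced and repressed genes together, illustrating how strongly the choice of metric shapes the tree.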

5.1.3. Common pitfalls 

There are several features of hierarchical clustering algorithms which can 
lead new users astray in the interpretation of their results. First, although 
hierarchical clusters are often used to "group" genes or experiments into 
discrete subsets, grouping is not the goal of hierarchical clustering per se. 
Commonly used hierarchical clustering algorithms only operate on the pair- 
wise distances between elements in a list; they do not consider the larger 
structure of the data. As such, these algorithms are "bottom-up" approaches 
and do not necessarily create a tree in which the average distance between 
all elements is fully minimized, nor do they suggest a best cutoff level in 
the resulting tree to produce meaningful subsets of elements. There are, 
however, a number of techniques which have been developed to help 
overcome these limitations of hierarchical clustering, notably the tree 
"pruning" algorithms implemented in the Bioconductor "hopach" library 
(van der Laan and Pollard, 2003). 

When exploring a hierarchical cluster, it is important to remember that 
each node has two equivalent orientations and that the orientation chosen 
when a tree is initially rendered is effectively arbitrary. Flipping the orien- 
tation of a node can dramatically change the visual appearance of a cluster, 
because of changes in the linear order of the gene or array axis, but has no 
effect on the overall pair-wise distances between genes. When considering 
the relatedness of elements on a clustered graph it is important to pay close 
attention to the actual distance between elements in the tree rather than the 
relative order on the plot. 
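The point about node orientation can be checked directly in R. In the sketch below (toy data, our own variable names), reversing the orientation of every node changes the linear order of the leaves but leaves the distances between elements within the tree (the cophenetic distances) untouched.

```r
# Toy data: five elements measured on a single variable.
x <- matrix(c(0.0, 0.1, 1.0, 1.2, 3.0),
            dimnames = list(paste0("g", 1:5), NULL))
hc <- hclust(dist(x), method = "average")

dend    <- as.dendrogram(hc)
flipped <- rev(dend)               # flip the orientation of every node

order_original <- labels(dend)     # leaf order as originally rendered
order_flipped  <- labels(flipped)  # leaf order after flipping

# Distances between elements within each tree, put in the same row/column order.
coph_original <- as.matrix(cophenetic(dend))
coph_flipped  <- as.matrix(cophenetic(flipped))[rownames(coph_original),
                                                colnames(coph_original)]

all.equal(coph_original, coph_flipped)
```

Two elements that end up adjacent on the plot after a flip are therefore not necessarily any more closely related than they were before.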

5.1.4. Software 

The open-source Cluster 3.0 package features an easy-to-use interface and a
host of clustering procedures (de Hoon et al., 2004). Cluster 3.0 can read
data from tab-delimited text files produced in the preprocessing step.
Depending on the setup, clustering runs produce files with some or all of the
following extensions:

.cdt
    These files hold the original table of data, with targets (genes)
    listed in each row and sample comparisons (experiments/arrays)
    listed in each column. The CDT tables may also have a gene
    weight (GWEIGHT) column and experiment weight
    (EWEIGHT) row which describe the clustering weight of each
    gene and experiment, respectively. As mentioned above, it can
    be useful to draw weights for a clustering run from measures of
    variance among technical replicates.

.gtr
    Gene-tree files contain tables which describe each branch in the
    tree as an association of two child genes/nodes (1st and 2nd
    column) and a parental gene/node (3rd column), as well as the
    distance from children to parent (4th column).

.atr
    Array-tree files are identical to gene-tree files, but describe array
    clustering.

The information saved in these output files can be visualized using the 
open-source, cross-platform, Java TreeView package (Saldanha, 2004). Java 
TreeView draws a heatmap of ratio values from data saved in the CDT file 
and associated gene or array trees if similarly named GTR or ATR files are
found in the same directory. 

5.2. Partitioning and network-based approaches 

In addition to hierarchical clustering, there are a wide variety of other types
of clustering methods which can be used to explore microarray datasets.
Several commonly used partitioning methods include: self-organizing maps
(SOMs) (available in Cluster 3.0 and the "som" R package), k-means
clustering (available in Cluster 3.0), and Prediction Analysis for
Microarrays (PAM) (available as the "pamr" R package and as an Excel
plug-in; see http://www-stat.stanford.edu/~tibs/PAM/). Like hierarchical
clustering, these partitioning algorithms make use of distance metrics and
linkage methods to cluster closely associated genes or arrays. Unlike
hierarchical clustering, partitioning algorithms are designed to construct
defined subsets of elements, although for some approaches, like SOMs, the use
of stochastic parameters means multiple runs on the same dataset may not
always produce the same result.

Among network-based approaches, the BioLayout Express3D
(http://www.biolayout.org/) implementation of the Markov Cluster Algorithm
(MCL) is particularly worthy of note (Freeman et al., 2007). MCL belongs to a
class of algorithms which optimize the overall distances between nodes across
the entire set of elements being clustered. Network-based approaches are much
more flexible than hierarchical clustering in the kinds of paths between nodes
which can be drawn, and can be more conducive to a visual exploration of gene
groups. BioLayout Express3D is an open-source, cross-platform software
package with excellent documentation and a well-developed user-interface.




6. Assessing the Statistical Evidence for Differential Expression

Microarray data are often used to identify changes in gene expression 
in response to an experimental treatment. Conceptually, we would like to 
be able to extract a unique list of genes from a microarray dataset which 
are differentially expressed in one biological sample compared to another. 
The challenge comes when deciding how to draw a cutoff in ratio values. 
Although a blanket cutoff of a "twofold change" has been used in the 
microarray field in the past, this is not a statistically rigorous approach nor is 
it a reasonable approximation for many datasets. 

Our goal in assessing the evidence for differential expression of targets in a
microarray dataset can be conceptualized in the same terms as any canonical
hypothesis testing problem. In this case, the null hypothesis, which we want
to know whether or not to reject, is that the observed log2(ratio) value for a
particular target is actually no different than 0. Given the variability in the
measurements observed among replicates, and the average expression
change across the dataset, we want to calculate statistics which will allow
us to assess the probability that rejecting or accepting this null hypothesis
will result in, respectively, a false-positive (type I error; identifying genes
as differentially expressed which are not) or false-negative (type II error;
identifying genes as not differentially expressed which are) determination.
However, calculating meaningful significance levels for each gene in a large
dataset is a complex problem. For example, if we test this null hypothesis for
15,000 probes on an array using a classical t-test, a conventional cutoff of
p < 0.05 would be expected to yield roughly 750 false positives by chance
alone, even if no target were truly differentially expressed. The high level of
multiple testing thus shifts the scale of meaningful p-values away from what
we are normally used to considering.
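The scale of the multiple testing problem is easy to demonstrate by simulation. The sketch below tests 15,000 simulated probes for which the null hypothesis is true by construction, and then applies a generic Benjamini-Hochberg adjustment using the base R p.adjust() function. This is neither SAM nor Limma, only an illustration of why unadjusted p-values mislead; the number of replicates and the noise level are arbitrary assumptions.

```r
set.seed(6)
n_genes <- 15000
n_reps  <- 4

# Simulated replicate log2(ratio) values with NO true differential expression.
x <- matrix(rnorm(n_genes * n_reps, mean = 0, sd = 0.5), nrow = n_genes)

# Classical one-sample t-test of each probe against a log2(ratio) of 0.
p <- apply(x, 1, function(v) t.test(v, mu = 0)$p.value)

# Roughly 5% of probes pass an unadjusted cutoff by chance alone.
n_raw <- sum(p < 0.05)

# Adjusting for multiple testing removes most or all of these false positives.
p_bh <- p.adjust(p, method = "BH")
n_bh <- sum(p_bh < 0.05)

c(unadjusted = n_raw, adjusted = n_bh)
```

Every probe called significant in this simulation is a false positive, which is why methods that explicitly estimate or control the false discovery rate are preferred.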

Here we briefly discuss two statistical approaches to this problem which 
are implemented in mature software packages and have been used to great 
effect in yeast studies: 

6.1. Significance analysis of microarrays 

Significance analysis of microarrays (SAM) uses gene-specific t-tests,
calculated using a nonparametric statistic, to provide an estimate of the false
discovery rate at a given ratio value cutoff (Tusher et al., 2001). SAM is
flexible enough to handle most common experimental designs, but is less
versatile in this regard than Limma. SAM is available as both an R library
and a Microsoft Excel plug-in. It is free to download for academic users.

6.2. Limma 

Limma is a Bioconductor software package which uses linear models to
analyze microarray data and assess evidence for differential gene expression
(Smyth, 2004). Limma can be used to model virtually any experimental
design, accounting for sample measurements taken across complex sets of
competitive hybridization pairs. An extensive user manual and step-by-step
walkthrough are provided in the Limma documentation (see "usersguide.pdf"
in the "/doc" subdirectory of the limma library, or enter
"> library(limma); limmaUsersGuide()" at the R prompt).



7. Exploring Gene Sets

Having identified an interesting group of targets (e.g., genes) using
cluster analysis or differential expression assessment, the next question we
usually want to address is: what is similar about the members of each group?
Here we review some common approaches to this question.

7.1. Gene Ontology term mapping

The Gene Ontology (GO) project is a cross-species gene annotation effort
which defines a controlled vocabulary of terms describing a gene product's
function, biological process, or cellular component (Ashburner et al., 2000).
When presented with a subset of genes which show a similar pattern of
expression in a microarray dataset, it can be useful to determine whether or
not the GO annotations associated with those genes suggest a common
biological function. In GO, terms are related to one another through a
branched hierarchy (more properly, the GO topology is an acyclic graph, as
child terms can be associated with multiple parents). Genes can be annotated
with any number of terms from any level of this hierarchy. Conceptually the
complex structure of GO terms is appealing, because it allows for a high degree
of flexibility in gene labeling. Unfortunately, this structure also makes it
difficult to appropriately estimate the statistical relevance of a potentially
overrepresented GO term in a list of genes. The GO Slim Mapper hosted at the
Saccharomyces Genome Database
(http://www.yeastgenome.org/cgi-bin/GO/goSlimMapper.pl) offers a Web-based
solution which assesses the statistical likelihood that a GO term is
meaningfully overrepresented in a given set of genes, plotting the results on a
GO term graph. The Bioconductor "GOSemSim" package also provides methods for
estimating GO semantic similarities in gene sets. Finally, the GenMAPP
software package offers an excellent set of tools for drawing pathways based on
GO terms which can then be highlighted based on gene behavior in a microarray
dataset (Dahlquist et al., 2002).



7.2. Motif searching 

Motif searching can be used to identify potential cis-regulatory elements
associated with coregulated gene products. MEME and AlignACE are two
popular software packages which each offer Web-based submission (Bailey
et al., 2009; Hughes et al., 2000). An in-depth discussion of motif searching
algorithms by Hao Li appears in "The Guide to Yeast Genetics and
Molecular Biology, Part B" (Li, 2002).



7.3. Network visualization 

There are a number of tools available which allow one to visualize interac- 
tion networks, integrating microarray data, proteomic data and other 
sources of gene annotations. The open-source Cytoscape project is par- 
ticularly worthy of note for its excellent documentation, accessible learning 
curve, and active community. 



7.4. Graphing array data on genome tracks 

Analysis of tiling microarray data usually involves visualizing probe
behaviors on genomic tracks. The "genome browser" tool in the Gaggle
project offers an easy-to-learn software solution for genomic visualization
of microarray data. Once probe identifiers are associated with a chromosome
number and chromosomal coordinates, microarray data can be visualized
with heatmaps, scatter plots, or line graphs on top of genome tracks
drawn from the UCSC genome browser. A note of caution for those who
usually use SGD as the source of genomic coordinates: the UCSC genome
browser draws from the October, 2003 assembly, while SGD has continued
to update the reference sequence. This means that there are slight
inconsistencies between the genome coordinate systems in these two databases.
The SGD Genome Browser (GBrowse) offers a Web-based alternative
which uses the current SGD coordinates.
http://bioconductor.org/packages/2.4/bioc/html/GOSemSim.html

http://www.cytoscape.org/ 
http://gaggle.systemsbiology.net/docs/geese/genomebrowser/ 
8. Managing Data



8.1. Data persistence and integrity 



One important, and easily overlooked, consideration when setting up a new 
microarray data analysis project in your lab is the best way to handle storage 
of microarray data and analysis results. Individual image and data files can be 
quite large, and data analysis pipelines typically produce a number of files 
across several stages. Ultimately, published array data will be archived in a 
public, off-site, MIAME-compliant database (see below), but how should 
the data associated with moderate- or large-scale microarray projects be 
handled within the lab? 

One approach is to use a relational database to store array data and 
provide a Web-based front-end for queries and data-submission (e.g., the 
UCSF NOMAD project). These solutions allow a working group to store 
array data in a centralized location, facilitating backups and maintenance. 
Modern, open-source, relational databases such as MySQL are capable of 
handling extremely large chunks of data, storing array images alongside data 
tables (such as image analysis files, probe information, clustering results, 
etc.), and provide efficient data querying facilities. 

Often, however, it is more convenient to work with microarray data as 
files on the local file system, rather than as tables stored in a relational 
database. Moving data to and from a database for use in other software 
packages can be cumbersome and confusing for those unfamiliar with these 
technologies. In my own work I have settled on a convenient solution that
works well for most types of projects: version control systems.

Version control systems such as CVS and Subversion were originally
developed to facilitate software development projects involving many developers.
Using Subversion to manage your microarray data files is fairly simple: you
work with your microarray data files in a set of folders saved to your local
machine and then synchronize the state of these folders with a Subversion
server after each major change or manipulation. Version control systems such as
Subversion offer several facilities which are extremely useful in a microarray
data analysis project. Below we review some of these features, all of which
should be taken into consideration when choosing a data storage strategy for
your lab.
8.1.1. Data replication

In Subversion parlance the files and folders you keep on your local machine 
are a "working copy" of a "repository" kept on the server. You can make as 
many working copies of your files on different computers as you need, each 
of which can be kept synchronized with the central repository. The concept 
here is similar to that of a fileserver, but Subversion is more useful in several 
ways. First, the working copy is actually a local copy of all of your files, so you 
do not need to be connected to a network to access your data. Second, it is up
to you to decide explicitly when you want to commit local changes to the
repository. This allows you to organize sets of changes into logical transactions.

8.1.2. Multiuser support 

Many people can work with the microarray data files simultaneously using 
their own working copies of the archive. When users commit the changes 
they have made locally back to the repository, Subversion detects conflicts 
(cases where mutually exclusive changes have been made to a file) and offers 
a rich set of tools that help you to merge changes into a new version of the 
file. Subversion servers also allow you to set up user authentication (unique
user names and passwords for different members of your lab) and set read/write
permissions on different portions of the archive.

8.1.3. Transactions and logs 

Subversion servers save changes to files in the repository instead of copies of
the files themselves. Each time you commit local changes in your working 
copy to the server, a log entry is created tracking all of the changes that were 
made. This means that it is possible to revert a working copy, or the central 
repository itself, to any previous version of the archive. Inevitably, when 
working on a data analysis project there will come a time when you are 
unsure of whether or not you have performed an analysis with the intended 
parameters or when you accidentally overwrite an important set of files. 
The version control system makes solving these problems straightforward: 
you can simply revert the repository to the last point in time when you 
know it was in a sane state. Subversion allows you to save a text log entry 
alongside every change to the contents of the repository. I have found this 
to be an incredibly powerful way to keep a record of data analysis projects. 
Log viewers are available which allow you to browse each of your log 
entries alongside the associated changes to files in the repository. 
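
The checkout, commit, update, and log cycle described in these sections can be sketched as a short sequence of calls to the standard `svn` command-line client. The Python below only assembles and prints the commands; nothing is run unless `dry_run` is switched off. The repository URL, file name, log message, and revision number are all hypothetical.

```python
import subprocess

# Hypothetical repository URL and working-copy folder, for illustration only.
REPO_URL = "https://svn.example.org/repos/arrays/trunk"
WORKING_COPY = "arrays"


def svn(*args):
    """Build an svn command line as a list of arguments."""
    return ["svn", *args]


def run(command, cwd=None, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(command))
    if not dry_run:
        subprocess.run(command, cwd=cwd, check=True)
    return command


WORKFLOW = [
    svn("checkout", REPO_URL, WORKING_COPY),       # create a working copy
    svn("add", "normalized/array_012.txt"),        # place a new file under version control
    svn("commit", "-m", "Renormalized array 012"),  # commit changes with a text log entry
    svn("update"),                                 # bring in changes committed by others
    svn("log", "--limit", "5"),                    # read the five most recent log entries
    svn("update", "-r", "41"),                     # return the working copy to revision 41
]

if __name__ == "__main__":
    for command in WORKFLOW:
        run(command)  # dry run: commands are printed, not executed
```

Graphical clients wrap the same operations, so the sequence is the same whether the commands are typed, scripted, or chosen from a menu.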

8.1.4. Web-based access 

The Subversion community has developed a mature Apache web server 
module which makes it easy to "publish" your Subversion repository on the 
Web. I have found this to be an extremely convenient way to share data 
analysis files with off-site collaborators. 
There are a number of version control systems available, each with
slightly different performance characteristics and feature sets. Subversion is
a well-rounded, open-source, cross-platform solution which I have found
works quite well for data analysis projects. Subversion is composed of two
software components: a client that needs to be installed on each computer
where you want to make a "working copy" and a server that should be set up
on a machine with reliable internet connectivity and a regular backup
schedule. If you are using Linux, chances are quite good that both the
Subversion client and server are already installed. Installers and binary files
for other platforms, including Windows and Mac OS X, can be found on the
Subversion web site. For Windows users I would highly recommend both
TortoiseSVN, which integrates Subversion client facilities into Windows
Explorer, and VisualSVN, which allows you to set up a Subversion server
and web server using a simple installer.



8.2. Public data repositories and MIAME compliance 

Community standards, and many journal submission agreements, require 
that microarray data presented in publications be submitted to public 
databases. Before the development of centralized microarray data hosting 
centers, many researchers posted microarray files on private lab web sites. 
This solution is problematic for several reasons. First, Web URLs are not a 
reliable resource: institutions often reorganize the URL structure of their 
web sites and labs can change institutional affiliation. Second, the structure 
and completeness of the available data can be extremely varied, making it 
difficult for interested third parties to recreate the published analysis or use 
the data in new studies. 

In response to inconsistencies in data-sharing practices in the microarray 
community, the Microarray Gene Expression Data Society (or MGED)
was formed to develop a standard describing the Minimal Information
About a Microarray Experiment, or MIAME (Brazma et al., 2001).
The MIAME standard takes on the daunting task of articulating a formal 
set of data structures which describe a wide variety of different micro- 
array experiments and platforms. Because of this, the standard itself is 
extensive and relies on a relatively abstract nomenclature to remain platform 
agnostic. 

Fortunately there are a number of public, curated, MIAME-compliant 
databases designed to guide experimenters through the data annotation 
process. Three popular choices are: 



GEO: The Gene Expression Omnibus (GEO)
(http://www.ncbi.nlm.nih.gov/projects/geo/) project, developed
and hosted at the NCBI, is a free, full-featured database
maintained by a responsive curatorial staff. GEO
supports submission of a wide variety of gene
expression datasets, offers a user-friendly Web-based
submission system, and scales well to handle large
submissions. GEO supports timed public release of
datasets and the creation of private URLs which can be
provided in the peer-review process. Finally, datasets
stored in GEO are automatically searched when users
enter terms in the "all databases" search box on the
NCBI home page.

ArrayExpress: ArrayExpress (http://www.ebi.ac.uk/microarray-as/ae/),
developed and hosted at the EBI, is similar to GEO in
scope and sophistication. There are some minor
differences between the two sites in the searching
facilities offered and the batch data upload file formats
that are supported.

SMD: The Stanford Microarray Database
(http://smd.stanford.edu/index.shtml) offers several Web-based data
analysis packages not available from other repositories.
However, SMD charges a significant usage fee to labs
outside of Stanford University.

http://tortoisesvn.tigris.org/

http://www.visualsvn.com/server/

http://www.mged.org/



For projects with a small number of arrays, submitting data to GEO is as 
simple as creating an account and following the Web-based guide to upload 
data files and provide MIAME-compliant annotations. GEO submissions 
are composed of three types of records: 



Platform: The platform record describes your microarray. This includes
annotations describing the array substrate, manufacturing
process, and probe sequences. If you are working with a
common commercial array platform there is a good chance
that a record has already been submitted for your array. If
this is the case, you do not need to create a duplicate entry;
you can simply skip this step and link your samples to the
preexisting platform record. It is useful to attach your
platform-specific probe/spot annotation table (GAL file,
CDF, etc.) as a supplementary file on these records.

Sample: Sample records describe each hybridization event in your
microarray experiment. For single-sample designs you will
create one sample record for each individual array in your
experiment. For competitive hybridization designs, you
will also create one sample record for each array in your
experiment, but in this case each sample record will actually
describe two biological samples. The sample record
combines annotations describing the origin and preparation
of the biological sample hybridized on the array, along with
information about the hybridization conditions and data
acquisition procedure. If your data preprocessing scheme
involves a normalization step, it is common to provide the
results of this normalization in the data table associated with
each sample record. Attaching your platform-specific image
analysis results (GPR file, CEL, SPOT, etc.) as a
supplementary file will allow interested users to explore
alternative normalization and preprocessing approaches.
Each sample record points to a single platform record.

Series: The series record describes your study and the relationships
among the associated sample records. This record is the
main entry point for users interested in your dataset. The
series record will allow you to describe the unique
experimental factors examined in your study and the
structure of your experimental replicates. Finally, the series
page provides users with a set of links which allow them to
download your data in a variety of formats.
Each series record points to one or more sample records.



All GEO submissions are reviewed by a curator before they are made 
public. This ensures that the submitted data and annotations meet a set of 
minimum standards. 

If your microarray project involves a large number of arrays, it may be 
too cumbersome and time consuming to manually submit data to GEO 
through the Web-based forms. For these cases, GEO supports a number of 
batch deposit formats. The simplest of these to use and understand is the 
SOFT format. SOFT files are plain text files which associate annotations, 
in header rows, with data, in tab-delimited tables. Each of the GEO record 
types (Platform, Sample, and Series) can be described in a SOFT file and 
each field that appears on the Web-based forms is given a corresponding 
label for use in SOFT file header rows. These files are relatively easy 
to produce with spreadsheet software or using simple scripts. Extensive 
documentation is available on the GEO web site. 
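
As an illustration of how simple such a script can be, the Python sketch below writes one Sample record with annotation header rows followed by a tab-delimited data table. The line prefixes and field labels follow the general pattern of the SOFT format as I understand it and should be checked against the GEO documentation before use. The sample name, annotations, probe identifiers, values, and the "GPLxxxx" placeholder are all invented.

```python
def sample_to_soft(name, attributes, columns, rows):
    """Return one Sample record as SOFT-style text.

    attributes: dict of field label -> value (annotation header rows)
    columns:    dict of column name -> description (data table columns)
    rows:       iterable of row tuples, one value per column
    """
    lines = [f"^SAMPLE = {name}"]
    lines += [f"!Sample_{label} = {value}" for label, value in attributes.items()]
    lines += [f"#{column} = {description}" for column, description in columns.items()]
    lines.append("!sample_table_begin")
    lines.append("\t".join(columns))  # column header row
    lines += ["\t".join(str(value) for value in row) for row in rows]
    lines.append("!sample_table_end")
    return "\n".join(lines) + "\n"


# Hypothetical sample; every name and value below is invented.
record = sample_to_soft(
    name="mutant_15min_rep1",
    attributes={
        "title": "Temperature-sensitive mutant, 15 min at 37 C, replicate 1",
        "platform_id": "GPLxxxx",
    },
    columns={
        "ID_REF": "probe identifier from the platform record",
        "VALUE": "normalized log2 ratio (experimental/reference)",
    },
    rows=[("probe_0001", 0.42), ("probe_0002", -1.10)],
)
print(record)
```

Looping such a function over a spreadsheet of sample annotations is usually all that a batch submission requires on the scripting side.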



http://www.ncbi.nlm.nih.gov/projects/geo/info/soft2.html
Abstract

Pre-mRNA processing is an essential control-point in the gene expression 
pathway of eukaryotic organisms. The budding yeast Saccharomyces cerevisiae 
offers a powerful opportunity to examine the regulation of this pathway. In this 
chapter, we will describe methods that have been developed in our lab and 
others to examine pre-mRNA splicing from a genome-wide perspective in yeast. 
Our goal is to provide all of the necessary information, from microarray design
to experimental setup to data analysis, to facilitate the widespread use of this
technology. 
1. Introduction

In the 30 years since Sharp and Roberts independently demonstrated 
the presence of split genes (Berget et al., 1977; Chow et al., 1977), it has
become abundantly clear that pre-mRNA splicing and its regulation play an 
essential role in regulating gene expression in eukaryotic organisms (House 
and Lynch, 2008). By regulating the efficiency of splicing of specific 
transcripts during development, in specific tissues or in response to external 
stimuli, the expression levels of particular genes can be controlled. Similarly, 
splicing can control proteomic diversity by regulating splice site choice via 
the process known as alternative splicing. Remarkably, whereas the number 
of genes predicted to be encoded by the human genome has been steadily 
decreasing over the last 15 years, the fraction of genes known to be alterna- 
tively spliced has gradually increased over this same time and is now thought 
to be over 90% of genes in humans (Wang et al., 2008). While great progress
has been made in understanding the mechanistic details of splicing, many 
questions remain about the pathways used to regulate this process. 

It was recognized early on that the components of the spliceosome and 
the basic mechanisms of splicing are highly conserved from yeast to humans. 
As such, the budding yeast Saccharomyces cerevisiae has played a pivotal role as 
a model organism for elucidating mechanisms of pre-mRNA splicing. 
While the underlying machinery is highly conserved between humans and 
yeast, the genome-wide distribution of introns is in fact quite different. 
Whereas over 90% of human genes are interrupted by at least one intron, 
only about 5% of S. cerevisiae genes contain a functional intron. In spite 
of this simplified architecture, one of the earliest recognized examples of 
regulated splicing was demonstrated in S. cerevisiae. In an elegant set of 
experiments, Roeder's group demonstrated that the Mer1 protein specifically
modulates the splicing of the MER2 transcript during meiosis
(Engebrecht et al., 1991). The observation that a specific protein can play
a pivotal role in the efficient splicing of a distinct set of transcripts highlights 
both the utility of splicing as a regulator of gene expression and also the need 
for genome-wide tools to assess global changes in splicing. 

This chapter focuses on methods for studying genome-wide changes in 
pre-mRNA splicing in yeast. The last several years have seen the develop- 
ment of several distinct but related microarray platforms that allow for 
global analysis of splicing (Clark et al., 2002; Juneau et al., 2007; Pleiss
et al., 2007a,b; Sapra et al., 2004; Sayani et al., 2008). In this chapter, we will
describe one such methodology using short oligonucleotide sequences that
specifically detect each of the different splicing isoforms. The first
oligonucleotide-based microarrays that were used to specifically probe changes
in splicing status were developed by the Ares lab (Clark et al., 2002).
We have used this approach to identify splicing responses to environmental
stress and to evaluate effects of mutations in core spliceosomal components
(Pleiss et al., 2007a,b). The goal of this chapter is to guide you through the
platform that we are currently using in our lab. While the details presented 
here are specific to our particular platform, we expect that the protocols 
can be readily adapted to meet the requirements of other platforms. 




2. Microarray Design

The fundamental philosophy underlying splicing-sensitive microar- 
rays is no different from standard gene expression microarrays in that 
short, complementary oligonucleotides are used to probe the abundance 
of a given RNA species. Whereas gene expression microarrays have oligo- 
nucleotides that target only coding regions of genes, the splicing-sensitive 
microarrays that we use include additional probes to both intron regions and 
the junction of the two ligated exons to distinguish changes in pre- and 
mature mRNA levels, respectively (Fig. 3.1). Using tools that are described 
below, we have designed sequences that target approximately 6000 genes, 
~300 introns, ~300 junctions, tRNAs, snRNAs, snoRNAs, and other 
functional noncoding RNAs in the yeast genome. The sequences are 
readily available at the NCBI Gene Expression Omnibus (http://www. 
ncbi.nlm.nih.gov/geo/, Accession number GPL8154) and can be down- 
loaded and used by anyone to order microarrays from any of several 
different vendors. In this chapter, we will describe our work using custom 
Agilent microarrays which contain eight identical hybridization zones each 
printed with approximately 15,000 of these oligonucleotides. However, we 
see no reason why these sequences would not be transferrable to other 
commercial or homemade platforms. 

[Figure 3.1 panel labels: Pre-mRNA probe; Mature mRNA probe; Total mRNA probe]



Figure 3.1 Microarray probe design. For each intron-containing gene, a minimum of 
three probes are designed. One measures changes in total RNA level by hybridizing to a 
region of the exon. A second measures changes in pre-mRNA level by hybridizing to 
a region of the intron. The third probe measures changes in mature mRNA levels by 
hybridizing to the junction of the two ligated exons. 
If you are interested in learning how these microarrays are designed,
continue reading the rest of this section. If not, order your microarrays and
skip ahead to Section 3. As with all microarrays, a key component in the
design of splicing-sensitive microarrays is the ability to probe the RNA of
interest without cross-reacting with off-target or nonspecific RNAs.
For probes that target the coding regions of genes (both intron-containing 
and intronless genes) this task is made relatively easy by numerous compu- 
tational programs that have been specifically designed for this purpose. 
We have used the OligoWiz program (http://www.cbs.dtu.dk/services/OligoWiz/)
to design 60 nt (nucleotide) probes to all ~6000 protein-coding
genes in the yeast genome (Wernersson et al., 2007). For intron-containing
genes, where possible, probes were designed that target regions
in both exons 1 and 2. However, because most yeast introns are located 
at the 5' end of the gene, and exon 1 tends to be very short, probes have 
been designed to target a region only in exon 2 for most intron-containing 
genes (Fig. 3.1). 

For intron-containing genes, OligoWiz can also be used to design 
probes targeting intronic regions. Because many yeast introns are short, 
we hoped to use shorter probe lengths to target the pre-mRNA species. 
However, it was unclear whether these probes would provide sufficient 
hybridization capacity. In our initial experiments with the Agilent platform, 
we tested both long (60 nt) and short (35 nt) probes targeted to the introns 
of all ~300 intron-containing genes and observed no significant loss of
signal intensity when using the shorter probes. Therefore, all of the pre- 
mRNA specific probes on our microarray target a 35 nt region of the intron 
of interest. Likewise, because many functional RNAs like tRNAs and 
snoRNAs are also small, we designed probes to functional RNAs using 
35 nt sequences. 

From the perspective of specificity, the most difficult oligonucleotides to 
design are those that target the mature mRNA species by hybridizing to the 
junction between ligated exons. Whereas the exon-targeting probes can be 
optimized by moving the targeted region anywhere within the coding 
sequence, by definition the oligonucleotides which probe changes in 
mature mRNA levels by targeting the junctions of ligated exons are 
restricted to the discrete sequences at the end and beginning of those 
neighboring exons. In designing these probes, we sought to identify the 
shortest length oligonucleotide which was sufficient to efficiently capture 
spliced mRNAs. Because of the varying sequence content in the exons of 
different intron-containing genes, we chose to vary the length such that the 
sequences upstream and downstream of the junction are energetically 
balanced (Fig. 3.2). Our initial experiments tested a variety of thermodynamic
stabilities for every exon-exon junction in the genome. In these
experiments, the best compromise between signal intensity and signal
specificity was found for those junction probes that had ΔG° values
closest to 17 kcal/mol on each side of the junction (Sugimoto et al., 1996).
While these parameters were used to design junction probes specific to the
S. cerevisiae genes, the thermodynamic properties are such that these parameters
are likely to be the optimal parameters for the design of junction
probes for any organism.

YML056c
  ΔG° (exon 1)   Exon 1                    Exon 2                     ΔG° (exon 2)
  10.2           CCAGTTACTG                AAGACGGTAA                 10.8
  13.8           TCCCAGTTACTG              AAGACGGTAAGT               13.8
  16.5           CTTCCCAGTTACTG            AAGACGGTAAGTGT             17.0   *
  18.8           GCTTCCCAGTTACTG           AAGACGGTAAGTGTC            18.5
  22.6           TGGCTTCCCAGTTACTG         AAGACGGTAAGTGTCCA          22.3

YML085c
  ΔG° (exon 1)   Exon 1                    Exon 2                     ΔG° (exon 2)
  11.2           TATTAGTATTAATG            TCGGTCAAG                  10.4
  13.9           GTTATTAGTATTAATG          TCGGTCAAGCT                14.2
  16.6           AAGTTATTAGTATTAATG        TCGGTCAAGCTG               15.9   *
  19.6           AGAAGTTATTAGTATTAATG      TCGGTCAAGCTGGT             19.5
  22.6           AGAGAAGTTATTAGTATTAATG    TCGGTCAAGCTGGTTG           22.4

YML017w
  ΔG° (exon 1)   Exon 1                    Exon 2                     ΔG° (exon 2)
  9.4            GAAGAAATGG                GAACAAATAATAC              11.2
  13.6           GGGAAGAAATGG              GAACAAATAATACAT            13.8
  16.4           CGGGAAGAAATGG             GAACAAATAATACATCT          16.8   *
  19.1           AACGGGAAGAAATGG           GAACAAATAATACATCTAAT       19.8
  22.7           GGAACGGGAAGAAATG          GAACAAATAATACATCTAATAAT    22.8

Figure 3.2 Design scheme for junction probes. For each intron-containing gene, a
series of probes was created such that the hybridization energies derived from
interactions with the upstream and downstream exons were thermodynamically balanced.
For some genes, like YML056c, this yielded a nearly equal number of base pairs on
either side of the junction. However, because of variable sequence content surrounding
the junctions, other genes required longer base pairing regions either upstream
(YML085c) or downstream (YML017w) of the exon-exon boundary. The boxed
sequences (marked here with an asterisk) correspond to the best performing probes
in test hybridizations, and are included in the final microarray design.
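
The balancing scheme described above can be sketched in a few lines of Python: each arm of the junction probe is extended away from the junction, and the length whose energy lies closest to the target is kept. This is only an illustration of the selection logic and not the procedure used to design the array. In particular, the energy function is a deliberately crude placeholder in arbitrary units; the real design relied on published thermodynamic parameters (Sugimoto et al., 1996), which are not reproduced here. All function names are invented.

```python
# Placeholder per-base contributions in arbitrary units. These are NOT the
# thermodynamic parameters of Sugimoto et al. (1996); substitute a real
# model before designing probes.
TOY_BASE_ENERGY = {"A": 1.0, "T": 1.0, "G": 2.0, "C": 2.0}


def toy_energy(sequence):
    """Crude stand-in for the hybridization energy of one probe arm."""
    return sum(TOY_BASE_ENERGY[base] for base in sequence)


def best_arm(candidates, target, energy):
    """Pick the candidate arm whose energy is closest to the target.
    Ties go to the earlier (shorter) candidate."""
    return min(candidates, key=lambda arm: abs(energy(arm) - target))


def balanced_junction_probe(exon1, exon2, target, energy=toy_energy):
    """Return (upstream arm, downstream arm) flanking the exon-exon junction.

    exon1 must end, and exon2 must begin, at the junction."""
    upstream = best_arm([exon1[-n:] for n in range(1, len(exon1) + 1)], target, energy)
    downstream = best_arm([exon2[:n] for n in range(1, len(exon2) + 1)], target, energy)
    return upstream, downstream


# Flanking sequences taken from the YML056c panel of Figure 3.2. The target
# of 6.0 is in the toy units above, not kcal/mol.
up, down = balanced_junction_probe("TGGCTTCCCAGTTACTG", "AAGACGGTAAGTGTCCA", target=6.0)
print(up + down)
```

Passing a different `energy` function is the intended way to swap in a real thermodynamic model; the selection logic does not change.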

Finally, the architecture of the Agilent microarray platform is such that 
60 nt probes are printed with their 3' ends covalently linked to the glass slide 
surface. Because the lengths of intron and functional RNA probes are fixed at 
35 nt and our junction-specific probes vary from 24 to 36 nt, we included a 
stalk region at the 3' end of these oligonucleotides to move the "targeting 
region" of the probes away from the glass in hopes of making them more 
readily accessible for hybridization. Several different stalk designs were tested. 
We settled on a sequence designed by Agilent to have low cross-reactivity 
with any genomic sequence. As expected, our initial experiments comparing 
probes containing stalks with those lacking them indicated that the stalks 
provided improved signal intensity with little or no loss in probe specificity. 
3. Sample Preparation

The goal of this section is to describe all the steps needed to go from 
experimental design to hybridizing a sample on a microarray. Obviously, 
the details of experimental design will vary. For the purposes of this chapter 
we will describe a specific experiment comparing a yeast strain containing a 
temperature-sensitive mutation in a canonical splicing factor to a matched 
wild-type strain, which will be referred to as the experimental and reference 
strains, respectively, from here on. In our experience, the data from this type 
of experiment are most easily understood when a time course is followed 
after shifting the strains to the nonpermissive temperature. Defects in pre- 
mRNA processing can often be detected within minutes in an experiment 
like this. Below are the protocols for each of the major steps in the pathway: 
cell collection, RNA isolation, cDNA synthesis, fluorescent labeling, and 
microarray hybridization and washing. 



3.1. Cell collection 

The first step in an experiment is to collect appropriate cells from the 
experimental and reference strains. In our experience, microarrays are
exceptionally sensitive assays that are able to detect subtle differences in 
growth and handling of samples. As such, we always collect actively grow- 
ing cells in early to mid-log phase. Likewise, we work to standardize all 
experimental conditions to have equivalent volumes, flask sizes, growth 
media, shift conditions, etc. for both the experimental and reference strains. 
Our preferred method for harvesting cells is by vacuum filtration using 
mixed cellulose ester filters (Millipore Cat.#: HAWP02500, or equivalent) 
and a vacuum manifold apparatus (Millipore Cat.#: XX1002500, or equivalent).
After collection, the filters can be placed in a 15 ml conical tube and
immediately frozen in liquid nitrogen. This method provides a fast and
simple mechanism for collecting rapid time points during a time course. We
have also collected cells by centrifugation at 5000 × g for 5 min, but disfavor
this method because of the time involved in getting cells from growth 
condition to frozen cells. 

The quantity of cells required for an experiment will depend upon your 
experimental conditions. In general, our protocol requires 40 µg of total
RNA for both the experimental and reference samples. We routinely
recover 20 µg of total cellular RNA from a single milliliter of cells grown
in YPD with an optical density (OD) equal to 0.5 (~5xl0 cells). Our 
yields of total RNA are typically twofold less for cells grown in synthetic 
media to the same OD. 
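
These figures translate into a quick estimate of how much culture to harvest. The Python sketch below uses only the numbers quoted above and assumes, as a simplification of my own, that RNA yield scales linearly with both OD and culture volume. The function name is invented.

```python
def culture_volume_ml(rna_needed_ug=40.0, od=0.5, synthetic_medium=False):
    """Estimate the culture volume (ml) needed to recover `rna_needed_ug`.

    Based on the yield quoted in the text: 20 µg of total RNA per ml of
    YPD culture at OD 0.5, and half of that in synthetic media.
    Assumes yield is proportional to OD and to volume."""
    yield_per_ml_at_od_half = 20.0
    if synthetic_medium:
        yield_per_ml_at_od_half /= 2.0
    yield_per_ml = yield_per_ml_at_od_half * (od / 0.5)
    return rna_needed_ug / yield_per_ml


print(culture_volume_ml())                       # YPD, OD 0.5
print(culture_volume_ml(synthetic_medium=True))  # synthetic media, OD 0.5
```

In practice it is sensible to harvest several times the estimated minimum so that a poor preparation does not force the experiment to be repeated.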
3.1.1. Protocol for cell collection

(1) On day 1, start a 5 ml culture of both experimental and reference strains 
in YPD and allow them to grow overnight at the permissive tempera- 
ture (25 °C). 

(2) On day 2, for both strains, use the 5 ml overnight cultures to inoculate a 
fresh 50 ml culture to a starting OD of 0.1. Allow these to continue 
growing at 25 °C. 

(3) When the two cultures have reached the appropriate cell density 
(between an OD of 0.5 and 0.75, ~3-4 h), collect 10 ml of cells by
filtration. Immediately transfer the filter into a 15 ml conical tube, cap 
the tube tightly, and plunge it into liquid nitrogen. 

(4) Transfer both the experimental and reference culture flasks to a shaking 
water bath at the nonpermissive temperature (37 °C). Collect addi- 
tional 10 ml aliquots as described above after 5, 15, and 30 min. Cells 
can be stored at -80 °C until ready for RNA isolation.
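
The inoculation in step (2) is a simple dilution. The Python sketch below computes the volume of overnight culture needed, assuming that OD is proportional to cell density. That assumption holds only in the linear range of the spectrophotometer, so a dense overnight culture should be diluted before it is measured. The function name and the example OD are invented.

```python
def inoculum_volume_ml(overnight_od, final_volume_ml=50.0, target_od=0.1):
    """Volume of overnight culture (ml) that brings `final_volume_ml` of
    culture to `target_od`, from C1 * V1 = C2 * V2."""
    if overnight_od <= target_od:
        raise ValueError("overnight culture is not denser than the target OD")
    return final_volume_ml * target_od / overnight_od


# Hypothetical overnight culture measured at an OD of 5.0.
print(inoculum_volume_ml(5.0))
```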



3.2. RNA isolation 

In our experience, there are two factors that are crucial for efficiently 
isolating high-quality RNA for use in microarray experiments. The first 
critical factor is achieving efficient cell lysis. In our protocol, we effect cell
lysis by using a combination of heat, exposure to phenol, exposure to SDS, 
and physical agitation. In our experience the most common reason for 
obtaining poor RNA yields results from insufficient vortexing during heat- 
ing (step 2 below). The second crucial factor is maintaining the integrity of 
the RNA. In this regard, we focus on two aspects: temperature and time. 
During all of the steps listed below, the samples should be handled on ice. 
Likewise, where possible, all centrifugation steps should be performed in a 
refrigerated centrifuge. Additionally, we strive to minimize the amount of 
time that elapses between taking the cells out of the -80 °C freezer (step 1)
and adding isopropanol to precipitate the RNA (step 8). Every effort should
be made to move as expeditiously as possible until this point to ensure
high-quality RNA. After the addition of isopropanol, the samples can be stored at
-20 °C indefinitely.

An important improvement to both the reproducibility and integrity of 
our RNA preparations has come from the use of tubes containing Phase 
Lock Gel (5 Prime, http://www.5prime.com) to facilitate the separation of 
the organic and aqueous phases. Use of these tubes allows for nearly 
quantitative recovery of the aqueous phase and removes the inconsistency 
associated with manual aspirations at the interphase. Our protocols indicate
3000 × g spins to separate the aqueous and organic phases when using Phase
Lock Gel; however, faster spins will give better separation. We routinely
spin our samples in a Beckman X-15R centrifuge with a SX4750A rotor at
top speed (5250 × g) at 4 °C for 5 min.
Figure 3.3 Typical banding pattern for purified total RNA. Two independent
preparations of total cellular RNA have been separated by gel electrophoresis on a
1% agarose gel run in 1× TAE buffer. Loaded in lane 1 is 0.5 µg of GeneRuler 100 bp
DNA Ladder (Fermentas). Lanes 2 and 3 are the total RNA samples. The locations of 
the 25S rRNA, 18S rRNA, and tRNA species are indicated. 



The end of the RNA isolation procedure is the first opportunity to assess 
the quality of your experiment. We assess RNA quality in several ways. 
First, we determine the quantity of RNA isolated using a spectrophotome- 
ter and compare the results to the expectations described above. If the RNA 
isolation yield is significantly less than expected, we tend to discard this 
material, collect new cells and repeat the RNA isolation. Second, we 
examine the integrity of the isolated RNA by visualization on an agarose 
gel. While the quality of the mRNAs in the total RNA preparation cannot 
be directly assessed by the agarose gel, we use the bands corresponding to 
the ribosomal RNAs as a proxy for the integrity of the mRNAs. Figure 3.3 
shows a typical banding pattern seen when 1 µg of RNA is separated on a 
1% agarose/1× TAE gel. As an alternative, a higher resolution analysis of 
mRNA quality can be obtained using instruments such as the Agilent 
Bioanalyzer. Standard precautions should be taken when handling RNA 
to avoid contamination with ribonucleases. 



3.2.1. Materials for RNA isolation 

15 ml Phase Lock Gel Heavy tubes (5 Prime Cat.#: 2302850) 

AES buffer (50 mM sodium acetate (pH 5.3), 10 mM EDTA, 1% SDS) 

Acid-phenol:chloroform (5:1) (pH < 5.5) (Ambion Cat.#: AM9720, or equivalent) 

Phenol:chloroform:IAA (25:24:1) (Ambion Cat.#: AM9730, or equivalent) 

Chloroform 

3 M sodium acetate (pH 5.3) 

Isopropanol 

70% ethanol 



3.2.2. Protocol for RNA isolation 

(1) Remove conical tubes containing filters from the −80 °C freezer and 
place on ice. Immediately add 2 ml of acid-phenol:chloroform, then 
add 2 ml of AES buffer and vortex well. Cells will be easily removed 
from the filters by vortexing. 

(2) Transfer the tubes to a 65 °C water bath and incubate for 7 min. 
Vortex thoroughly (5 to 10 seconds) once every minute. 

(3) Transfer the tubes to ice and incubate for 5 min. During incubation, 
prepare one 15 ml Phase Lock Gel Heavy tube for each sample by 
spinning briefly at >3000 × g. 

(4) Transfer the entire organic and aqueous contents to a prespun 15 ml 
Phase Lock Gel Heavy tube. Leave the filters behind if possible, but do 
not worry if they do transfer. Once the material is in the Phase Lock 
Gel Heavy tube, do not vortex, as this fragments the Phase Lock Gel. 
Spin at >3000 × g at 4 °C for 5 min. 

(5) In the same 15 ml Phase Lock Gel Heavy tube, add 2 ml of 
phenol:chloroform:IAA to the supernatant. Mix by shaking, but do not 
vortex. Spin at >3000 × g at 4 °C for 5 min. 

(6) In the same 15 ml Phase Lock Gel Heavy tube, add 2 ml of chloroform 
to the supernatant. Mix by shaking, but do not vortex. Spin once 
again at >3000 × g at 4 °C for 5 min. 

(7) Prepare a new 15 ml conical tube with 2.2 ml of isopropanol and 
200 µl of 3 M sodium acetate. 

(8) Pour the supernatant from step 6 into the 15 ml conical tube with 
isopropanol. Mix by inverting several times. 

(9) Transfer 2 ml of the isopropanol slurry into a 2 ml microcentrifuge 
tube and spin the RNA at top speed in a microcentrifuge 
(>14,000 × g) at 4 °C for 20 min. (The remainder of the RNA slurry 
can be stored at −20 °C for future use, as this is the most stable storage 
method for RNA.) 

(10) Carefully pour off the supernatant from the 2 ml tube so as not to 
disrupt the pellet. Add 2 ml of 70% ethanol to the pellet and mix by 
inverting several times. Spin again at top speed in a microcentrifuge 
(>14,000 × g) at 4 °C for 5 min. 

(11) Repeat step 10 once. 

(12) Carefully pour off the supernatant from the 2 ml tube so as not to 
disrupt the pellet, then briefly dry the RNA in a SpeedVac. Do not 
heat or overdry the samples. 

(13) Dissolve the RNA in 50 µl of water. Determine the actual concentration 
using a spectrophotometer. Use a conversion factor of 1 A260 = 40 µg/ml 
RNA. This should give a concentration of approximately 2 mg/ml. 
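
The arithmetic in step 13 can be sketched as follows. This is a minimal illustration only: the function names and the example reading are hypothetical, and the only value taken from the protocol is the conversion factor of 1 A260 = 40 µg/ml.

```python
# Convert a spectrophotometer reading into an RNA concentration and yield.
# Only the conversion factor comes from the protocol; the rest is illustrative.
A260_TO_UG_PER_ML = 40.0  # step 13: 1 A260 = 40 µg/ml RNA


def rna_concentration_mg_per_ml(a260, dilution_factor=1.0):
    """Return the concentration of the undiluted sample in mg/ml."""
    ug_per_ml = a260 * A260_TO_UG_PER_ML * dilution_factor
    return ug_per_ml / 1000.0


def rna_yield_ug(a260, volume_ul, dilution_factor=1.0):
    """Return the total RNA (µg) contained in volume_ul of undiluted sample."""
    mg_per_ml = rna_concentration_mg_per_ml(a260, dilution_factor)
    return mg_per_ml * volume_ul  # mg/ml is numerically equal to µg/µl


if __name__ == "__main__":
    # Hypothetical reading: a 1:100 dilution of the 50 µl sample with A260 = 0.50
    conc = rna_concentration_mg_per_ml(0.50, dilution_factor=100)
    total = rna_yield_ug(0.50, volume_ul=50, dilution_factor=100)
    print(f"{conc:.2f} mg/ml, {total:.0f} µg total")
```

A reading far from the expected concentration of approximately 2 mg/ml is a cue to recheck the dilution before proceeding.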



3.3. cDNA synthesis 

After successful isolation of total RNA, the next step is to fluorescently 
label the experimental and reference samples for microarray hybridization. 
Several different methods exist, including direct RNA labeling (Wiegant 
et al., 1999), generation of cRNA (Gelder et al., 1990), or generation of 
cDNA (DeRisi et al., 1997). It is worth noting that Agilent's protocols for 
gene expression microarrays take advantage of a linear amplification method 
using T7 RNA polymerase to generate direct-labeled cRNAs. While there 
are advantages and disadvantages to each of these methods, for reasons 
described below our protocols are instead designed and optimized for 
cDNA synthesis. 

While traditional expression microarrays (and Agilent's gene expression 
protocols) use oligo-dT sequences to prime their cDNA (and cRNA) 
reactions, we instead use random 9-mer oligonucleotides. There are two 
main reasons for choosing this method. First, we are interested in looking at 
RNAs independent of their poly(A) tail status. We presume that many 
pre-mRNA species may lack fully developed poly(A) tails, and such species 
may be undetectable when priming with oligo-dT. The second reason is 
related to the fact that most S. cerevisiae introns are located at the 5′ end of 
their transcripts. Therefore, to distinguish between pre- and mature mRNA 
species, cDNAs must be produced that correspond to the 5′ end of the 
transcript, the efficiency of which is greatly reduced if all priming events take 
place at the poly(A) tail. While random priming helps to alleviate this issue, 
the potential downside of random priming is the production of cDNAs 
corresponding to the highly abundant ribosomal RNA species. For a given 
intron-containing gene there are on the order of one million copies of 
rRNA for every single copy of pre-mRNA, thereby highlighting the 
need for highly specific probe design. Nevertheless, we find that the com- 
bination of our probe design and priming strategy does produce highly 
specific data. For example, as a measure of the potential cross-reactivity of 
the ribosomal cDNA species, we have examined microarrays comparing 
wild-type strains to strains containing complete deletions of intron-containing 
genes. Whereas robust signal is detected on the intron-specific probes for the 
wild-type strain, we find that the signal intensity for the deletion strains is 
significantly reduced to levels near background. This suggests that there is 
minimal cross-hybridization of the ribosomal cDNAs to our specific probes. 

In general, there are two different methods by which fluorescent dyes 
can be incorporated into cDNA: either by inclusion of a fluorescently 
labeled nucleotide analog which can be directly incorporated by reverse 
transcriptase, or by inclusion of a derivatized nucleotide analog containing a 
reactive chemical group which can be used to covalently attach fluorescent 
dyes in subsequent reactions. Our protocols utilize the latter method, 
including aminoallyl-dUTP in the reverse transcription reaction, which is 
highly reactive to N-hydroxysuccinimidyl (NHS) ester derivatized fluorophores. 
The advantage to this method is that the aminoallyl-dUTP is 
efficiently incorporated by reverse transcriptase, whereas fluorescently 
labeled nucleotide analogs are often poor substrates for polymerases. 
Because cellular RNAs contain modified nucleotides with potentially reac- 
tive primary amines, our protocol includes an RNA hydrolysis step after 
cDNA synthesis to facilitate their removal. As will be described later, 
the cDNA purified from this protocol can be efficiently reacted with 
fluorescent dyes. 

The total amount of fluorophore incorporated into a sample is an 
important factor for optimal microarray signal and can be controlled by 
adjusting the ratio of aminoallyl-dUTP to dTTP in the cDNA synthesis 
reaction. Our initial experiments using Agilent microarrays indicated that 
the specific fluorescent activity of the hybridized sample needed to be 
significantly lower than what has been typically used for spotted microarrays 
(DeRisi et al., 1997; Pleiss et al., 2007b). At high concentrations of 
aminoallyl-dUTP, we observed optical interaction between the two fluorophores, 
presumably because of the overlap between their emission and absorption 
spectra. This problem was alleviated by reducing the ratio of aminoallyl- 
dUTP to dTTP. The different requirements for Agilent and spotted micro- 
arrays presumably reflect the differences in the density and orientation of the 
oligonucleotide probes in these two formats. 
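
To make the ratio concrete, the short sketch below computes the fraction of the thymidine nucleotide pool that is aminoallyl-dUTP, using the stock concentrations listed in Section 3.3.1 (0.2 mM aminoallyl-dUTP and 9.8 mM TTP). The function name is ours. Note that this is the fraction in the nucleotide mix; treating it as the fraction incorporated into cDNA would be an additional assumption that the text does not make.

```python
def aminoallyl_fraction(aa_dutp_mm, dttp_mm):
    """Fraction of the combined (aa-dUTP + dTTP) pool that is aa-dUTP."""
    return aa_dutp_mm / (aa_dutp_mm + dttp_mm)


if __name__ == "__main__":
    # Stock concentrations from the materials list in Section 3.3.1
    fraction = aminoallyl_fraction(aa_dutp_mm=0.2, dttp_mm=9.8)
    print(f"aa-dUTP is {fraction:.1%} of the pool (1:{9.8 / 0.2:.0f} relative to dTTP)")
```

Lowering the first argument relative to the second is the adjustment described above for reducing the specific fluorescent activity of the sample.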

For both the experimental and reference samples, a single splicing 
microarray requires 20 µg of total RNA as starting material. We always 
perform our microarray experiments as technical repeats where the orientation 
of the dyes is reversed, resulting in so-called "dye-flipped" replicates. 
As indicated earlier, this means that 40 µg of total RNA are needed for both 
experimental and reference samples for a replicate set of hybridizations. 
We set up a single cDNA synthesis reaction for the entire 40 µg of total 
RNA, which will be divided later for fluorescent labeling and hybridization. 
Important considerations in setting up the cDNA reactions are the concen- 
tration of total RNA and random primers. Our best results have been 
achieved using a final concentration of total RNA at or below 0.5 mg/ml 
and random primers at 0.25 mg/ml. At RNA concentrations higher than 
this, the efficiency of cDNA synthesis drops off significantly. 
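
Because RNA stocks rarely come out at exactly 2 mg/ml, the RNA and water volumes in the annealing mix need to be recalculated for each sample. The sketch below shows one way to do this, assuming the fixed volumes of Table 3.1 (50 µl annealing mix containing 5 µl of 10× RT buffer and 5 µl of dN9, followed by 50 µl of enzyme/dNTP mix). The function names are hypothetical.

```python
def rna_primer_mix(rna_stock_mg_per_ml, rna_ug=40.0, mix_volume_ul=50.0,
                   rt_buffer_ul=5.0, dn9_ul=5.0):
    """Return volumes (µl) for the RNA/primer annealing mix of Table 3.1."""
    rna_ul = rna_ug / rna_stock_mg_per_ml  # µg divided by µg/µl gives µl
    water_ul = mix_volume_ul - rt_buffer_ul - dn9_ul - rna_ul
    if water_ul < 0:
        raise ValueError("RNA stock is too dilute to fit 40 µg into the mix")
    return {
        "RT buffer (10x)": rt_buffer_ul,
        "dN9 (5 mg/ml)": dn9_ul,
        "Total RNA": rna_ul,
        "Water": water_ul,
    }


def final_rna_mg_per_ml(rna_ug=40.0, mix_volume_ul=50.0, enzyme_mix_ul=50.0):
    """RNA concentration once the enzyme/dNTP mix has been added."""
    return rna_ug / (mix_volume_ul + enzyme_mix_ul)


if __name__ == "__main__":
    for reagent, volume in rna_primer_mix(rna_stock_mg_per_ml=2.0).items():
        print(f"{reagent:16s} {volume:5.1f} µl")
    print(f"Final RNA concentration: {final_rna_mg_per_ml():.2f} mg/ml")
```

With these volumes any stock below 1 mg/ml cannot supply 40 µg in the available 40 µl, which is why the function raises an error rather than returning a negative water volume.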

For the protocol listed below we purify recombinant MMLV reverse 
transcriptase and make our own buffers. However, commercial enzymes 
can also be used. In considering different commercial enzymes it is important 
to use an enzyme like SuperScript Reverse Transcriptase (Invitrogen) 
that has the RNaseH activity disrupted. Note, however, that most commercial 
enzymes are packaged with a reaction mix that contains buffer, salt and 
MgCl2 together. When we hybridize our primers to our RNA, we find that 
it is important to leave out the MgCl2 to avoid stabilizing RNA structures, 
and we therefore recommend using homemade buffers for this step. Under the 
conditions described below we typically achieve cDNA synthesis yields that 
are between 30% and 50% conversion by mass (12 to 20 µg of cDNA from 
40 µg of starting RNA). As this is a second opportunity for quality control, 
yields significantly below this expectation warrant a repeat of the cDNA 
synthesis step; otherwise there will be insufficient signal for the microarray 
experiment. 
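
This quality-control check is simple to automate. The sketch below converts an A260 reading of the purified cDNA into a mass and a percent conversion, using the 1 A260 = 40 µg/ml factor and the 70 µl eluate (two 35 µl elutions) given in the protocol below. The function names and the example reading are hypothetical, and the 30% cutoff is the lower bound quoted above, used here as an illustrative threshold.

```python
A260_TO_UG_PER_ML = 40.0  # conversion factor the protocol also uses for cDNA


def cdna_yield_ug(a260, eluate_ul=70.0, dilution_factor=1.0):
    """Mass of cDNA (µg) in the eluate, from an A260 reading."""
    ug_per_ml = a260 * A260_TO_UG_PER_ML * dilution_factor
    return ug_per_ml * eluate_ul / 1000.0


def conversion_fraction(cdna_ug, rna_input_ug=40.0):
    """Fraction of the input RNA mass recovered as cDNA."""
    return cdna_ug / rna_input_ug


def passes_qc(cdna_ug, rna_input_ug=40.0, min_fraction=0.30):
    """True if the yield reaches the lower end of the expected range."""
    return conversion_fraction(cdna_ug, rna_input_ug) >= min_fraction


if __name__ == "__main__":
    yield_ug = cdna_yield_ug(a260=6.0)  # hypothetical undiluted reading
    print(f"{yield_ug:.1f} µg cDNA, "
          f"{conversion_fraction(yield_ug):.0%} conversion, "
          f"QC pass: {passes_qc(yield_ug)}")
```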

After a successful cDNA synthesis reaction it is important to purify the 
products away from any unincorporated aminoallyl-dUTP prior to fluorescent 
labeling. In the protocol below we use a commercial kit from Zymo 
Research that is designed to purify oligonucleotides from unincorporated 
dNTPs. An alternative that we have found to be both cost-effective and 
high quality is to use 96-well Glass Fiber DNA binding plates. Such plates 
are made by several manufacturers and are widely available. We have 
commonly used plates from Nunc (Cat.#: 278010) along with homemade 
cDNA binding buffer (5 M guanidine-HCl, 30% isopropanol, 90 mM 
KOH, 150 mM acetic acid) and wash buffer (10 mM Tris-HCl (pH 8.0), 
80% ethanol). 

3.3.1. Materials for cDNA synthesis 

Total RNA for experimental and reference samples 

10× RT buffer = 0.5 M Tris-HCl (pH 8.5), 0.75 M KCl 

10× dN9 = 5 mg/ml dN9 oligonucleotides 

10× MgCl2 = 30 mM MgCl2 

10× DTT = 0.1 M DTT 

10× dNTP's (+aa-dUTP) = 10 mM ATP, 10 mM CTP, 10 mM GTP, 9.8 mM TTP, 0.2 mM aminoallyl-dUTP 

Reverse transcriptase 

RNA hydrolysis buffer = 0.3 M NaOH, 0.03 M EDTA 

Neutralization buffer = 0.3 M HCl 

DNA Clean and Concentrator-25 kit (Zymo Research, Cat.#: D4006) 

3.3.2. Protocol for cDNA synthesis 

Using the volumes in the protocol listed in Table 3.1 as a guide, but adjusted 
for actual RNA concentrations, do the following: 

Table 3.1 Experimental setup for cDNA synthesis reaction 

Reagent                   Stock            Volume       Volume used per 
                          concentration    used (µl)    8 rxns (µl) 

RNA/primer mix 
  RT buffer               10×              5 
  dN9                     5 mg/ml          5 
  Total RNA               2 mg/ml          20 
  Water                                    20 
  Total                                    50 

Enzyme/dNTP mix 
  RT buffer               10×              5            44 
  MgCl2                   10×              10           88 
  DTT                     10×              10           88 
  dNTP's (+aa-dUTP)       20×              5            44 
  Reverse transcriptase   20×              5            44 
  Water                                    15           132 
  Total                                    50           440 

(1) Anneal the primers to the total RNA by heating in a 60 °C water bath 
for 5 min. (Note that at this stage the RNA and primer concentrations 
(1 and 0.5 mg/ml, respectively) are twice the values that they will be 
in the final cDNA synthesis reaction. Note also that MgCl2 is omitted 
from this step, but buffer and salt are included, which we find increases 
the yield by about 20% relative to annealing in water alone.) 

(2) Immediately after heating the samples, transfer them onto ice for an 
additional 5 min to allow the primers to anneal. 

(3) While the RNA/primer mix is cooling, make the enzyme/dNTP mix 
as described in Table 3.1. 

(4) Add 50 µl of enzyme/dNTP mix to the annealed RNA/primer mix 
(the RNA should now be at its final concentration of 0.5 mg/ml). 
Briefly vortex and spin the tubes, then allow them to incubate at 
42 °C. In our experience the reaction is >90% complete after 2 h; 
however, we routinely incubate the samples overnight. 

(5) To hydrolyze RNA prior to purification of the cDNA, add 50 µl of 
RNA hydrolysis buffer, vortex, and spin down. Place this mix in a 
60 °C water bath for 15 min, then transfer to ice. 

(6) Neutralize the solution by adding 50 µl of neutralization buffer. 
Vortex and spin down. 

(7) Each column in a DNA Clean and Concentrator-25 kit can bind a 
maximum of 25 µg of cDNA. Therefore, a single cDNA reaction, 
which starts with 40 µg of total RNA and yields about 20 µg of 
cDNA, can be purified with a single column. Follow the manufacturer's 
instructions for purification of single stranded cDNA until the 
elution step. Proceed with elution as described in step 8. 

(8) Transfer the column to a clean 1.7-ml tube. Add 35 µl of water 
directly onto the filter. Wait 30 s, then spin the samples in the 
microfuge at top speed (>14,000 × g) for 1 min (spinning at 10,000 × g 
is effective for elution and can help to prevent the lids of the collection 
tubes from snapping off). 

(9) Add a second 35 µl aliquot of water directly onto the filter, wait 
another 30 s, and spin the samples in the microfuge at top speed 
(>14,000 × g) for 1 min. 

(10) After discarding the column, vortex the eluate well and spin down. 
Quantitate cDNA yield using a spectrophotometer. Although this 
is a cDNA sample, we continue to use a conversion factor of 
1 A260 = 40 µg/ml cDNA. Using this conversion factor, expect to 
recover between 12 and 20 µg of cDNA from 40 µg of starting RNA. 

(11) Split the eluate into two equal aliquots (~33 µl) in 1.7 ml tubes and 
dry in the SpeedVac. These samples will subsequently be labeled with 
the two different dyes to be used as matched replicates for the 
microarray. 
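
The "per 8 rxns" column of Table 3.1 corresponds to 8.8 reaction equivalents (for example, 44 µl / 5 µl), that is, eight reactions plus a 10% excess. The sketch below scales the enzyme/dNTP mix to any number of reactions on that basis. The function name is hypothetical, and applying the same 10% excess to other reaction counts is our assumption rather than something the protocol states.

```python
# Per-reaction volumes (µl) of the enzyme/dNTP mix, taken from Table 3.1
PER_REACTION_UL = {
    "RT buffer (10x)": 5,
    "MgCl2 (10x)": 10,
    "DTT (10x)": 10,
    "dNTP's (+aa-dUTP) (20x)": 5,
    "Reverse transcriptase (20x)": 5,
    "Water": 15,
}


def master_mix(n_reactions, overage=0.10):
    """Scale the enzyme/dNTP mix; overage=0.10 reproduces the 8-reaction column."""
    scale = n_reactions * (1 + overage)
    return {reagent: round(volume * scale, 1)
            for reagent, volume in PER_REACTION_UL.items()}


if __name__ == "__main__":
    mix = master_mix(8)
    for reagent, volume in mix.items():
        print(f"{reagent:30s} {volume:6.1f} µl")
    print(f"{'Total':30s} {sum(mix.values()):6.1f} µl")
```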



3.4. Fluorescent labeling of cDNA 

Many different vendors sell fluorescent dyes which have both the appropri- 
ate spectral properties and the appropriate derivatizations to react with the 
primary amine of the aminoallyl modified nucleotide. We have used 
both Cy dyes from GE Healthcare (Cy3 and Cy5) and Alexa dyes from 
Invitrogen (Alexa 555 and 647), and have found largely overlapping results. 
We presume that similar dyes from other vendors could also be used. 
A major concern for both choosing and handling these fluorescent dyes is 
that there is ample evidence in the literature that significant oxidation of 
both Cy5 and Alexa 647 can result from the levels of ozone that are 
commonly present in the air. Methods for mitigating ozone levels will be 
discussed in Section 3.6. A possible alternative is to use a new dye from GE 
Healthcare called Hyper5 which is a modified version of Cy5 that is 
reportedly stable to ozone (Dar et al., 2008). 

A single tube of NHS ester derivatized Cy3 or Cy5 contains a sufficient 
amount of fluorophore to label 16 cDNA samples. Because the NHS ester is 
highly unstable we do not store opened dye packages. Therefore, the actual 
volume of DMSO that we use to dissolve a single dye aliquot is determined 
by the number of samples. For example, in this protocol where eight 
different hybridizations are being performed (a four-point time course 
with dye-flipped replication), a single dye pack should be dissolved in 
42 µl of DMSO (enough for eight 5 µl aliquots). 
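
The same reasoning can be applied to other experiment sizes. In the sketch below, the 5 µl aliquot and the limit of 16 samples per dye tube come from the text; the 2 µl of excess is inferred from the single worked example (42 µl for eight aliquots) and should be treated as an assumption, as should the function name.

```python
ALIQUOT_UL = 5             # dissolved dye added to each labeling reaction
MAX_SAMPLES_PER_TUBE = 16  # one dye tube labels at most 16 cDNA samples


def dmso_volume_ul(n_samples, extra_ul=2):
    """DMSO volume (µl) for one dye tube; extra_ul is an assumed pipetting excess."""
    if not 1 <= n_samples <= MAX_SAMPLES_PER_TUBE:
        raise ValueError("one dye tube labels between 1 and 16 samples")
    return n_samples * ALIQUOT_UL + extra_ul


if __name__ == "__main__":
    # Four-point time course with dye-flipped replication: 8 samples per dye
    print(dmso_volume_ul(8), "µl of DMSO per dye tube")
```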

3.4.1. Materials for fluorescent labeling of cDNA 

0.1 M sodium bicarbonate (pH 9.0) 
DMSO (Fluka, Cat.#: 41647) 
Cy3 NHS ester (GE, Cat.#: PA23003) 
Cy5 NHS ester (GE, Cat.#: PA23005) 
DNA Clean and Concentrator-25 kit (Zymo Research, Cat.#: D4006) 

3.4.2. Protocol for fluorescent labeling of cDNA 

(1) Dissolve dried down cDNA in 5 µl of 0.1 M sodium bicarbonate. 
To ensure that all of the cDNA is resuspended in this small volume, 
vortex well and spin down several times. 

(2) Dissolve the Cy3 and Cy5 dyes in 42 µl of DMSO. Because it is difficult 
to see whether all of the dye is dissolved, vortex well and spin down 
several times. 

(3) To one of the two tubes of experimental cDNA, add 5 µl of Cy3 
dissolved in DMSO. To the other tube of experimental cDNA add 5 µl 
of Cy5 dissolved in DMSO. Do the same for your reference cDNA 
samples. 

(4) Incubate the reactions in the dark in a 60 °C water bath for 1 h. While 
many dye labeling protocols incubate at room temperature, we observe 
a significant increase in labeling efficiency at elevated temperatures. 

(5) To the 10 µl labeling reaction add 100 µl of DNA binding buffer from 
the DNA Clean and Concentrator kit, and then proceed according to 
manufacturer's instructions until the elution step. Proceed with elution 
as described in step 6. 

(6) Transfer the column to a clean 1.7 ml tube. Add 35 µl of water directly 
onto the filter. Wait 30 s, then spin the samples in the microfuge at top 
speed (>14,000 × g) for 1 min (spinning at 10,000 × g is effective for 
elution and can help to prevent the lids of the collection tubes from 
snapping off). 

(7) After discarding the column, quantitate cDNA yield using a 
spectrophotometer. Continue to use a conversion factor of 1 A260 = 40 µg/ml 
cDNA. Using this conversion factor, expect to recover >50% of the 
cDNA that was included in the labeling reaction. Note that it is in 
theory possible to monitor Cy3 and Cy5 incorporation at this step; 
however, with this protocol the expected absorption levels for these 
dyes are close to background. 

(8) Pool the appropriate Cy3- and Cy5-labeled samples for each 
hybridization, and dry in the SpeedVac. Important note: make sure that the 
appropriate samples are combined at this step. This is the easiest 
moment to ruin a great experiment. For example, the Cy3-labeled 
experimental sample should be combined with the Cy5-labeled 
reference sample, and vice versa. 

(9) After the samples have dried in the SpeedVac, it is important to 
resuspend the samples in water quickly, because Cy5 is highly sensitive 
to ozone. Resuspend each pellet in 25 µl of water. Vortex well to 
ensure proper mixing. We typically proceed immediately to 
hybridization. If necessary, however, samples can be flash frozen in liquid 
nitrogen and stored in the dark at −80 °C indefinitely. 
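
Since the pooling in step 8 is the easiest place to make a mistake, it can help to print a pooling sheet before starting. The sketch below generates one for a dye-flipped time course. The function name and sample labels are hypothetical; the "forward" and "flip" names follow the usage in Section 5.2, where the forward array pairs Cy5-labeled experimental sample with Cy3-labeled reference sample.

```python
def pooling_plan(time_points):
    """Return one forward and one flip hybridization per time point."""
    plan = []
    for tp in time_points:
        plan.append({"array": f"{tp} forward",
                     "Cy5": f"experimental {tp}",
                     "Cy3": f"reference {tp}"})
        plan.append({"array": f"{tp} flip",
                     "Cy5": f"reference {tp}",
                     "Cy3": f"experimental {tp}"})
    return plan


if __name__ == "__main__":
    # Hypothetical four-point time course: 8 hybridizations, one full slide
    for entry in pooling_plan(["t1", "t2", "t3", "t4"]):
        print(f"{entry['array']:12s} Cy5: {entry['Cy5']:16s} Cy3: {entry['Cy3']}")
```

A four-point time course produces exactly eight pooled samples, matching the eight hybridization compartments of a single slide.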



3.5. Microarray hybridization 

Each Agilent Custom 8x15K Microarray is a single piece of 1 × 3 in. 
microscope glass with eight different hybridization zones. The individual 
hybridizations are kept separated by use of a matched gasket slide which, 
when sandwiched with the microarray slide, creates eight distinct hybridi- 
zation compartments. All eight compartments need to be simultaneously 
hybridized. A detailed description of the microarray architecture and 
instructions for their use accompanies each microarray order and should 
be used as a supplement to the protocols described here. 

Because dust is highly fluorescent, it is important to keep dust to a 
minimum during the hybridization procedure. We minimize the time that 
the microarray surfaces are exposed to air and always work on clean surfaces. 
Likewise, gloves should always be worn to avoid contamination of the glass. 
When possible the glass slides should be handled with tweezers. When this 
is not possible, the glass slides should only be handled by their edges. 

3.5.1. Materials for hybridization 

Agilent Custom 8x15K Microarray 
2× hybridization buffer (Agilent, Cat.#: 5190-0403) 
Eight chamber gasket slides (Agilent, Cat.#: G2534-60014) 
Hybridization chamber (Agilent, Cat.#: G2534A) 
Hybridization oven (Agilent, Cat.#: G2545A) 

3.5.2. Protocol for hybridization 

(1) Heat samples to 95 °C for 2 min. Place in a drawer or dark box for 
5 min to cool, then spin down briefly. 

(2) To each of the samples add 25 µl of 2× Agilent hybridization buffer. 
Mix by gently pipetting up and down. DO NOT VORTEX as this 
introduces bubbles that are problematic for the hybridization. 

(3) Place a gasket slide in a hybridization chamber with the gaskets facing up. 

(4) Load 40 µl of each sample into the appropriate gasket section. Avoid 
pipetting bubbles. To reduce the evaporation of the samples that are 
loaded first, the time to load eight samples should be minimized. 

(5) After all eight sections have been loaded, carefully place the microarray 
slide onto the gasket slide. Note that it is important to ensure that the 
printed side of the microarray slide is exposed to the sample (see Agilent 
protocol). 

(6) Close the hybridization chamber. Once assembled, rotate the entire 
chamber two or three times to wet the gasket linings. Ensure that any 
bubbles within the hybridization compartments can rotate freely. 
Gently tapping the chamber may help to release any stuck bubbles. 

(7) Hybridize in a rotating oven at 60 °C for 16 h. In our experience 
hybridization times ranging from 14 to 18 h produce similar results. 



3.6. Microarray washing 

Prior to scanning the microarray, unhybridized cDNAs must be washed off 
of the glass surface. The design of the probes on the microarray is optimized 
for specificity at 60 °C. Because nonspecific species will begin to cross- 
hybridize at lower temperatures, an important consideration during the 
washing steps is minimizing the time between removing the microarray 
from the hybridization oven and washing away any unbound cDNAs. 

A second important consideration during microarray washing is the 
capacity of atmospheric ozone to oxidize Cy5 dyes. This oxidation potential 
is most acute on a dried microscope slide after hybridization and washing. 
Several different mechanisms have been developed to mitigate the effects of 
ozone. The solution we chose was to create a chamber where ozone can be 
specifically removed from the air. Prebuilt chambers are commercially 
available which are of sufficient size to house a microarray scanner (Scigene, 
NoZone Workspace). Alternatively, ozone scavenging filters are available 
(Ozone Solutions, NT-40) which can be used to remove ozone from any 
chamber. We have built a simple chamber in our lab using Plexiglass which 
has sufficient working space for wash dishes and our microarray scanner 
(15 cubic feet). If changing the infrastructure in your lab is not a possibility, 
other options exist. For example, Agilent has developed a stabilizing wash 
solution (Agilent, Cat.#: 5185-5979). Likewise, Genisphere has developed 
a coating solution that stabilizes the optical properties of the Cy5 dye 
(Genisphere, Cat.#: Q500500). 

3.6.1. Materials for microarray washing 

Glass washing dishes with slide racks (Thermo, Shandon Complete Staining Assembly 121) 
Wash I = 6× SSPE and 0.005% sarcosyl (or Agilent, Cat.#: 5188-5325) 
Wash II = 0.06× SSPE and 0.005% sarcosyl (or Agilent, Cat.#: 5188-5326) 



3.6.2. Protocol for microarray washing 

(1) Prepare three glass wash dishes, two with Wash I (one without a slide 
rack, one with a slide rack), and one with Wash II (with a slide rack). 

(2) Remove the hybridization chamber from the hybridization oven. 
Open the chamber such that the sandwich of the microarray and gasket 
slides remains intact. Quickly transfer sandwiched slides to the Wash I 
dish with no rack. With the sandwich completely submerged and 
supported with one hand, use tweezers to gently pry the two slides 
apart, allowing the gasket slide to drop to the bottom of the glass dish. 
Hold the microarray slide with your fingers being careful to touch only 
the stickers or sides of the glass. 

(3) Keep the microarray slide submerged in Wash I and swish the micro- 
array slide back and forth two or three times to remove most of the 
unhybridized sample. 

(4) Quickly transfer the microarray slide into the rack in the second Wash I 
dish, taking care to minimize exposure of the microarray to air. 

(5) Vigorously agitate the rack up and down for 1 min. 

(6) Transfer the Wash I dish with microarray into the ozone-free chamber. 
Place Wash II glass dish with rack in ozone-free chamber. Quickly 
transfer the microarray slide from Wash I to Wash II. 

(7) Gently agitate slide rack up and down, ensuring that all bubbles are 
washed off the microarray slide. 

(8) With a pair of tweezers, grab the microarray slide by a corner with the 
stickers. Slowly remove the slide from Wash II. If this is done slowly 
enough (over 10 s) the microarray will come out dry because of the 
sheeting qualities of the wash solution. 

(9) The microarray can either be scanned immediately or can be put into a 
dark box and protected from ozone until ready to be scanned. 




4. Microarray Data Collection 

The Agilent microarrays are printed on standard microscope glass 
(1 × 3 in.) and can be analyzed using any scanner that can accommodate this 
format. The features on an Agilent Custom 8x15K Microarray are approximately 
60 µm in diameter. We get the highest quality data when we scan at a 
pixel size of 5 µm; this yields about 100 pixels of data for each feature. There 
are several companies that manufacture microarray scanners, including 
Agilent, Molecular Devices, and Tecan, which are capable of scanning at this 
resolution. We use an Axon 4000B for all of our data collection (Molecular 
Devices). The instruments listed above all use lasers to excite the Cy3 and Cy5 
dyes and photomultiplier tubes (PMTs) to quantitate fluorescence intensity at 
each spot. Below are some general guidelines to facilitate microarray scanning, 
but because each of these instruments has different parameters, user guides 
should be consulted for specific details. 

An important assumption underlying microarray experiments is that 
the total global signal is unchanged between the experimental and refer- 
ence samples with only a small subset of features showing change in expres- 
sion behavior. As such, an important consideration when scanning the 
microarrays is to equalize the total fluorescent signal intensity from the Cy3 
and Cy5 samples. In the early days, this required a painstaking process of 
manually adjusting the PMT gain settings. Fortunately, newer versions 
of scanner control software allow the user to largely automate this step. 

Whereas new software has improved the ability to equalize Cy3 and Cy5 
signal, another important consideration is maximizing total signal intensity. 
For most microarray scanners the dynamic range is limited to about three 
orders of magnitude. By comparison, the difference in abundance of a rare 
pre-mRNA species and an abundant mature mRNA species can easily be 
greater than three orders of magnitude, meaning that no single scanning 
condition can generate reliable data for every RNA species. Agilent's 
microarray scanners automate the process of scanning each microarray 
twice; once using the lasers at full power and once at reduced power 
settings. The Agilent software then integrates the data from these two 
scans to increase the dynamic range of the experiment. In our experience, 
we have empirically identified conditions that maximize data collected from 
just a single scan. On our Axon 4000B, using the built-in software, we set a 
saturation tolerance level equal to 0.1%. Using these settings and with the 
PMT gain values between 500 and 600 for both channels, only the most 
highly abundant RNA species in the cell are oversaturated, yet robust signal 
can be detected for the rare pre-mRNA species. 
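
Two of the numbers quoted in this section can be checked with a few lines of arithmetic. The sketch below estimates the number of pixels covering a circular feature and the fraction of saturated pixels in a set of intensity values. It is illustrative only: the function names are ours, the features are assumed to be circular, and the saturation value of 65535 assumes 16-bit pixel data, which the text does not specify. How a given scanner's software defines its saturation tolerance should be confirmed in its user guide.

```python
import math


def pixels_per_feature(feature_diameter_um, pixel_size_um):
    """Approximate number of square pixels covering a circular feature."""
    feature_area = math.pi * (feature_diameter_um / 2.0) ** 2
    return feature_area / pixel_size_um ** 2


def saturated_fraction(pixel_values, saturation_value=65535):
    """Fraction of pixels at or above the saturation value (assumed 16-bit)."""
    saturated = sum(1 for value in pixel_values if value >= saturation_value)
    return saturated / len(pixel_values)


if __name__ == "__main__":
    # 60 µm feature scanned at 5 µm per pixel, as described in the text
    print(round(pixels_per_feature(60, 5)), "pixels per feature")
    toy_scan = [120, 65535, 3400, 900]  # hypothetical pixel intensities
    print("within 0.1% tolerance:", saturated_fraction(toy_scan) <= 0.001)
```

For a 60 µm feature at 5 µm resolution the estimate comes to a little over 110 pixels, in line with the "about 100 pixels" quoted above.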




5. Microarray Data Analysis 

In this section, we will divide the tasks for data analysis into two 
general parts: one is the technical details for processing the scans from the 
previous section, and the second is deriving biological meaning from an 
experiment. The first step is to extract quantitative measurements for both 
experimental and reference samples for each of the ~15,000 features for 
all eight hybridizations. For this step in the data analysis pathway, processing 
a splicing-sensitive microarray is no different than processing a standard 
gene expression microarray. A description of the tools necessary for extract- 
ing data from a microarray experiment can be found in Chapter 2 of this 
volume, and should be consulted for this and subsequent sections. Included 
in that chapter are both the descriptions of the software that can be used and 
instructions for their implementation. 

5.1. Data normalization 

Having successfully extracted quantitative information for all of the features 
on the microarray, the next step is to mathematically normalize the data 
derived from the experimental and reference samples. As described in 
Chapter 2 of this volume, the Bioconductor package of data analysis 
software (Gentleman et al., 2004) provides several different options for 
normalizing microarray data. We have compared the output when analyzing 
a single microarray using several different normalization algorithms and 
have found that the LOWESS normalization algorithm does the best job at 
addressing nonlinear behavior often seen in microarrays (Yang et al., 2002). 
Therefore, we use the maNorm package within Bioconductor to implement 
LOWESS normalization across all of our data. The output from this 
analysis is a single value corresponding to the log2-transformed ratio of the 
Cy5 intensity to the Cy3 intensity for all 15,000 features. 
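
For readers unfamiliar with the quantities involved, the sketch below computes the two values on which intensity-dependent normalization operates: M, the log2 ratio of Cy5 to Cy3, and A, the average log2 intensity. It does not perform the LOWESS fit itself, which in our workflow is left to Bioconductor; in that approach a smooth curve fitted to M as a function of A is subtracted from each M value. The function name and example intensities are hypothetical.

```python
import math


def ma_values(cy5, cy3):
    """Return (M, A) for one feature from background-corrected intensities."""
    m = math.log2(cy5 / cy3)         # log2 ratio of Cy5 to Cy3
    a = 0.5 * math.log2(cy5 * cy3)   # mean log2 intensity of the two channels
    return m, a


if __name__ == "__main__":
    # Hypothetical feature: fourfold more Cy5 signal than Cy3 signal
    m, a = ma_values(cy5=1024.0, cy3=256.0)
    print(f"M = {m:.2f}, A = {a:.2f}")
```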



5.2. Replication 

Of the ~15,000 features present on our splicing-sensitive Agilent microarrays 
there are ~7000 unique features, each of which is replicated at either 
two or three distinct locations on each microarray. The next step in 
analyzing the data is to collapse these replicate data into a single averaged 
value. A spreadsheet program such as Microsoft Excel can be used to 
calculate averages as well as coefficients of variation for these measurements. 
Having compressed the data to a single value for each unique feature, the 
next task is to compare and average the values determined between the dye- 
flipped replicate experiments. It is important here to repeat that the standard 
output from the maNorm package is always presented as a ratio of Cy5 to 
Cy3 intensity. According to the design of our experiment, one microarray 
compares Cy5-labeled experimental sample with Cy3-labeled reference 
sample, whereas its corresponding replicate compares Cy5-labeled refer- 
ence sample with Cy3-labeled experimental sample, which for the purposes 
of this section we will refer to as "forward" and "flip" experiments, 
respectively. Because of their orientations, the data output from the maNorm 
package for the "forward" and "flip" experiments are expected to be 
negatively correlated. Therefore, to determine the average behavior 
described by the dye-flipped experiments, the "flip" value must be multiplied 
by -1 prior to averaging. At this point, the value associated with 
each of the ~7000 unique features represents a composite value incorporating 
replication within a single microarray and between dye-flipped microarrays 
and constitutes as many as six independent measurements for each 
time point. 
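
The two averaging steps above can also be scripted rather than done in a spreadsheet. The following Python sketch collapses replicate spots to a mean and coefficient of variation, then averages the "forward" value with the sign-inverted "flip" value. The function names, the data layout, and the feature identifiers (YFG1_pre, YFG1_mature) are hypothetical and serve only as an illustration.

```python
from statistics import mean, stdev

def collapse_replicates(spots):
    """spots: list of (feature_id, log2_ratio) pairs from one microarray.
    Returns {feature_id: (mean, coefficient_of_variation)}."""
    grouped = {}
    for feature_id, value in spots:
        grouped.setdefault(feature_id, []).append(value)
    summary = {}
    for feature_id, values in grouped.items():
        avg = mean(values)
        # The CV is undefined for a single spot or for a mean of zero.
        cv = stdev(values) / abs(avg) if len(values) > 1 and avg != 0 else None
        summary[feature_id] = (avg, cv)
    return summary

def combine_dye_flip(forward, flip):
    """Average the forward mean with the flip mean multiplied by -1."""
    return {fid: (forward[fid][0] + (-1.0 * flip[fid][0])) / 2.0
            for fid in forward if fid in flip}

# Hypothetical replicate spots for one intron-containing gene.
forward_spots = [("YFG1_pre", 1.0), ("YFG1_pre", 1.2),
                 ("YFG1_mature", -0.4), ("YFG1_mature", -0.6)]
flip_spots = [("YFG1_pre", -0.9), ("YFG1_pre", -1.1),
              ("YFG1_mature", 0.5), ("YFG1_mature", 0.7)]

combined = combine_dye_flip(collapse_replicates(forward_spots),
                            collapse_replicates(flip_spots))
```

Note that a coefficient of variation becomes unstable when the mean log2 ratio is close to zero, so it is most informative for features that change substantially.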
5.3. Splicing-specific data 

While no further steps are necessary to analyze the behavior of intronless 
genes, analyzing intron-containing genes requires additional steps. Whereas 
just one value is needed to describe the behavior of an intronless gene, even 
the simplest intron-containing gene has a minimum of three values asso- 
ciated with it, corresponding to the total, pre- and mature mRNA probes. 
None of these metrics by themselves is sufficient to understand the splicing 
behavior of a given transcript, but rather they must be considered in relation 
to one another. In their original experiments, the Ares group addressed this 
issue by creating splice junction and intron accumulation indexes, in which 
the values for the mature probe and the pre-mRNA probe, respectively, are 
divided by the value of the total mRNA probe (Clark et al., 2002). By 
comparison, we have chosen to analyze the data by concurrently examining 
the behaviors of each of the three individual metrics (Pleiss et al., 2007a,b). 
Database software such as Microsoft Access can be used to associate the 
total, pre- and mature mRNA specific probes with one another for a given 
intron-containing gene. 
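
As a minimal sketch of this association step, the following Python code groups probe values by gene and, for comparison, computes the two indexes described above. It assumes the inputs are log2 ratios, so that dividing by the total mRNA value corresponds to subtracting its log2 value; the record layout, function names, and gene name are hypothetical.

```python
def associate_probes(records):
    """records: list of (gene, probe_type, log2_ratio), where probe_type is
    'total', 'pre', or 'mature'. Returns {gene: {probe_type: value}}."""
    by_gene = {}
    for gene, probe_type, value in records:
        by_gene.setdefault(gene, {})[probe_type] = value
    return by_gene

def splicing_indexes(by_gene):
    """Compute index values for genes that have all three probe types.
    On a log2 scale, a ratio of ratios becomes a difference."""
    indexes = {}
    for gene, probes in by_gene.items():
        if not {"total", "pre", "mature"} <= probes.keys():
            continue  # skip genes with an incomplete probe set
        indexes[gene] = {
            "intron_accumulation": probes["pre"] - probes["total"],
            "splice_junction": probes["mature"] - probes["total"],
        }
    return indexes

# Hypothetical values for one intron-containing gene.
records = [("YFG1", "total", 0.2), ("YFG1", "pre", 2.0), ("YFG1", "mature", -0.3)]
by_gene = associate_probes(records)
indexes = splicing_indexes(by_gene)
```

Keeping the three values together in `by_gene` supports the concurrent examination of total, pre-, and mature mRNA, while `indexes` gives the normalized view.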



5.4. Extracting biological meaning 

Congratulations! Having successfully collected experimental samples, 
isolated RNAs, converted RNA to cDNA, labeled and hybridized the 
cDNA to microarrays, extracted and normalized the data, you have now 
made it to the hard part — understanding the biology underlying your 
experiment. For traditional gene expression experiments, this is rather 
straightforward in the sense that one can look for transcripts that show 
either increased or decreased abundance. Those transcripts showing 
increased abundance either reflect genes whose transcription has increased 
or whose degradation has decreased. From the perspective of splicing, it is 
more difficult to describe what a defect should look like. A simple expecta- 
tion might be that a defect in splicing would lead to accumulation of pre- 
mRNA levels with a concomitant decrease in mature mRNA levels. This 
expectation, however, presumes nearly equal steady state levels of pre- and 
mature mRNA. For genes that are efficiently spliced, where the mature 
mRNA level is much greater than the pre-mRNA level, a defect in splicing 
could be expected to show an accumulation of pre-mRNA with little or no 
change in mature mRNA. Our experience in analyzing these types of data 
demonstrates that both of these profiles can be seen among the different 
intron-containing genes in S. cerevisiae. As such, several different descriptors 
exist which may be used to identify a transcript whose splicing is altered 
in response to an experimental condition. 
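
The two profiles described above can be expressed as simple rules on the pre-mRNA and mature mRNA log2 ratios. The following Python sketch is one possible descriptor set; the function name and the threshold of 1.0 (a twofold change) are assumptions chosen for illustration, not values taken from our analysis.

```python
def classify_splicing_profile(pre, mature, threshold=1.0):
    """Label a transcript from its pre-mRNA and mature mRNA log2 ratios.
    threshold is the minimum absolute log2 change treated as a real change."""
    if pre >= threshold and mature <= -threshold:
        return "pre-mRNA up, mature mRNA down"
    if pre >= threshold and abs(mature) < threshold:
        return "pre-mRNA up, mature mRNA unchanged"
    return "no splicing defect by these criteria"
```

Because no single rule captures every affected transcript, rules of this kind are best treated as a first filter before the data are examined from other perspectives.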

With these ideas in mind, the next challenge for any experiment is 
finding the important biological changes among the sea of data collected 
in a microarray experiment. Without a doubt, this is the most challenging 
part of the microarray experiment because no single approach to interrogat- 
ing the data will identify all of the genes whose behavior is modified. 
Rather, we find the best way to identify these genes is to look at the data 
from many different perspectives. For example, in our experience compar- 
ing data across an experimental time course is a powerful way to identify 
genes that are responding to an experimental condition. Two important 
software packages we use to organize and visualize our data, Cluster (Eisen 
et al., 1998) and Java Treeview (Saldanha, 2004), are described in 
Chapter 2 of this volume. We find that Cluster works particularly well for 
organizing splicing-specific information. For example, Fig. 3.4 shows the 
results of a time course examining a mutation in the canonical splicing factor 
Prp16 versus a wild-type reference. By concurrently examining the behavior 
of the total, pre-, and mature mRNA, transcripts can be identified 
whose splicing is affected by the experimental condition. This figure 
illustrates the variety of behaviors possible for different transcripts, as 
described above, and the global patterns that result from a defect 
in splicing. 




6. Future Methodologies 

Splicing-sensitive microarrays are a powerful tool for examining 
genome-wide changes in pre-mRNA splicing. However, as with all micro- 
array technologies, the advent of high-throughput, short-read sequencing 
technologies promises to change the way splicing is studied from a genome- 
wide perspective (Wold and Myers, 2008). In theory, these short-read 
sequencing methodologies have an advantage over microarray technologies 
in that they take an unbiased approach to the experiment. Because microarrays 
require that probes be designed to target specific RNAs, they are by 
nature poor at discovering previously uncharacterized species. By directly 
sequencing total cellular RNA, short-read sequencing methodologies 
should be able to identify both previously uncharacterized RNAs and 
novel splicing events. Nevertheless, many of the same challenges that the 
splicing-sensitive microarray community faced must now be resolved in the 
context of short-read sequencing methodologies. For example, the most 
widely used current methods for sequencing cellular RNAs utilize poly(A) 
selection schemes to remove ribosomal RNAs from the pool of sequenced 
samples. For the same reasons described at the beginning of this chapter, we 
think it is likely that many of the interesting RNA processing events happen 
independent of the poly(A) status of the RNA. Until such time as these 
methodologies have been developed for the sequencing technologies, 
splicing-sensitive microarrays will continue to be a fast, cost-efficient, and 
effective way to examine genome-wide changes in pre-mRNA splicing. 

[Figure 3.4 panels: Total mRNA, Pre-mRNA, Mature mRNA; horizontal axis: Time] 

Figure 3.4 Genome-wide changes in pre-mRNA splicing. Results are presented from 
an experiment comparing a strain containing a cold-sensitive prp16-302 mutation with a 
matched wild-type strain as both were shifted to the nonpermissive temperature. Data 
are shown from unshifted samples (grown at 30 °C), as well as after 10 and 60 min of 
incubation at 16 °C. Each horizontal line represents the behavior of a single intron-containing 
gene during this time course. Notice that some genes (indicated with a red 
bar) show a dramatic increase in pre-mRNA level with very little change in mature 
mRNA level, whereas other genes (indicated with a green bar) show a strong increase 
in pre-mRNA level concomitant with a strong decrease in mature mRNA level. 
Abstract 

Much of eukaryotic gene regulation is mediated by binding of transcription 
factors near or within their target genes. Transcription factor binding sites 
(TFBS) are often identified globally using chromatin immunoprecipitation 
(ChIP) in which specific protein-DNA interactions are isolated using an antibody 
against the factor of interest. Coupling ChIP with high-throughput DNA 
sequencing allows identification of TFBS in a direct, unbiased fashion; this 
technique is termed ChIP-Sequencing (ChIP-Seq). In this chapter, we describe 
the yeast ChIP-Seq procedure, including the protocols for ChIP, input DNA 
preparation, and Illumina DNA sequencing library preparation. Descriptions 
of Illumina sequencing and data processing and analysis are also included. 
The use of multiplex short-read sequencing (i.e., barcoding) enables the analy- 
sis of many ChIP samples simultaneously, which is especially valuable for 
organisms with small genomes such as yeast. 




1. Introduction 

The number of completely sequenced genomes has increased dramat- 
ically with improvements in DNA sequencing technologies, and the deter- 
mination of coding sequences both in vivo and in silico has identified novel 
genes within these newly sequenced genomes (Aparicio et al., 2002). 
First, understanding gene regulation requires more than just knowing 
the genomic sequence. Second, one must identify the repertoire of transcription 
factors present. Coding sequences of transcription factors are often 
conserved in the course of evolution (Borneman et al., 2006), allowing their 
discovery in many cases by comparison to homologous transcription factors 
from closely related organisms (Frazer et al., 2004). Third, it is crucial to 
establish a list of regulated genes (target genes) for each transcription factor. 
Computational searches for particular transcription factor DNA binding 
motifs upstream of putative target genes have been useful tools to obtain 
such a list, although these predictions require experimental validation in vivo 
(Tompa et al., 2005). Moreover, the presence of a consensus binding motif 
is not always directly linked to transcription factor binding, as many perfect 
motifs are not bound by a transcription factor whereas some imperfect 
motifs are bound under the same environmental conditions (Borneman 
et al., 2007; Martone et al., 2003). Fourth, transcription factors can regulate 
genes under multiple cellular environments and stresses (Harbison 
et al., 2004). Global characterization of binding sites of a single transcription 
factor, therefore, demands multiple experiments in different conditions as 
well as profiling across the whole genome, making such studies labor-intensive. 
Large international consortiums, such as ENCODE in humans 
(Birney et al., 2007) and modENCODE in Drosophila melanogaster and 
Caenorhabditis elegans (Celniker et al., 2009), aim to characterize every 
functional DNA element across the whole genome, and the study of 
transcription factor binding represents a major part of these efforts. 

In Saccharomyces cerevisiae, there are approximately 200–300 described 
transcription factors (TFs) among the ~6000 predicted ORFs (Costanzo 
et al., 2000). Direct analysis of transcription factor binding upstream of 
target genes was performed initially using DNase footprinting (Axelrod 
and Majors, 1989) and/or PCR quantification of DNA associated with an 
immunoprecipitated transcription factor, a procedure called chromatin 
immunoprecipitation (ChIP) (Kuo and Allis, 1999; Orlando et al., 1997). 
These methods could only analyze a few promoters at a time, making the 
comprehensive discovery of unexpected, novel TF-bound DNA elements 
unrealistic. The development of DNA microarray technology has 
provided the field of gene regulation with a powerful tool for genome-wide 
characterization of transcription factor binding. This technique, 
termed ChIP-chip, relies on the immunoprecipitation of a transcription 
factor of interest with its associated DNA, followed by hybridization to a 
DNA microarray (Horak and Snyder, 2002). In addition, C-terminal protein 
tagging with an exogenous well-defined epitope (e.g., Myc, HA) 
circumvents the need for raising native antibodies against every transcription 
factor (Janke et al., 2004; Longtine et al., 1998). The advantages of 
epitope tagging include the use of commercially available antibodies, the 
ability to tag multiple DNA-binding proteins in a high-throughput 
fashion and a lower occurrence of nonspecific immunoprecipitation and 
cross-reaction of chromatin, ultimately resulting in decreased noise. 

Novel high-throughput sequencing technologies, such as 454/Roche, 
Solexa/Illumina and ABI/SOLiD, have revolutionized genomic studies by 
allowing for large-scale sequence analysis through the generation of millions 
of short sequencing reads. For example, new transcripts and splice variants 
have been discovered in multiple organisms using RNA-Seq (Lister et al., 
2008; Mortazavi et al., 2008; Nagalakshmi et al., 2008; Wilhelm et al., 
2008). Transcription factor binding studies have also benefited from 
ultra-throughput sequencing via the development of ChIP-Sequencing 
(ChIP-Seq) (Johnson et al., 2007; Robertson et al., 2008). Instead of hybridizing 
the ChIP DNA sample to a microarray, each sample is processed directly 
into a DNA library for sequencing and analyzed separately after sequencing. 
Because of its improved sensitivity and reduced background, ChIP-Seq is replacing 
the array-based ChIP-chip in mammalian studies aiming to characterize 
transcription factor binding. Typically, two to four times more transcription 
factor binding sites (TFBS) are determined using ChIP-Seq in comparison 
with ChIP-chip; the accuracy and resolution of the data are higher as well 
(Robertson et al., 2007). ChIP-Seq studies have been used to characterize 
transcription factor binding during cell growth and a stress response 
(Johnson et al., 2007; Robertson et al., 2007), have enabled the establishment 
of a regulatory network (Chen et al., 2008), and have helped to determine 
epigenetic changes (Marks et al., 2009). ChIP-Seq is widely used by the 
ENCODE and modENCODE consortia for mapping TFBS in humans, 
C. elegans and D. melanogaster. In humans, it has been used to examine 
nucleosome positioning (Schones et al., 2008) and, in yeast, our group has 
characterized the distribution of three DNA-binding proteins, Cse4p, RNA 
polymerase II, and Ste12p, using this procedure (Lefrancois et al., 2009). 

Recently, efforts have been made to develop a multiplexing scheme 
for Illumina sequencing, allowing many DNA samples to be sequenced 
simultaneously (Craig et al., 2008; Cronn et al., 2008; Lefrancois et al., 
2009). As a typical flowcell lane currently yields approximately 8 million 
or more uniquely mapped sequence reads, the number of mapped reads far 
exceeds the minimal number required for mapping binding sites in yeast, 
flies, and worms (Lefrancois et al., 2009; Zhong et al., 2010). We have 
therefore developed a barcoded ChIP-Seq strategy that enables the accurate 
sequencing and analysis of multiple yeast ChIP samples in the same flowcell 
lane (Lefrancois et al., 2009). Data generated in this fashion have identified 
binding sites for Ste12p and RNA PolII, and novel noncentromeric binding 
sites for Cse4p. We have also characterized the distribution of a reference 
sample, for example, input DNA. Input DNA, consisting of nonimmunoprecipitated, 
sonicated, cross-linked DNA, has great importance in ChIP-Seq 
studies as ChIP DNA samples are normally scored against it for TFBS 
identification (Auerbach et al., 2009; Rozowsky et al., 2009). The following 
protocols describe yeast ChIP-Seq, from ChIP to sequence data analysis. 
We also include our modifications to Illumina sequencing library prepara- 
tion for generation of barcoded DNA libraries or standard, nonbarcoded 
DNA libraries. 
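
To make the multiplexing idea concrete, the following Python sketch sorts reads from one lane into per-sample bins by their index sequence and trims the index before mapping. It assumes that the index occupies the first bases of each read and that an exact match is required; the index sequences, sample names, and function name are hypothetical and do not reproduce the adapters used in our protocol.

```python
def demultiplex(reads, barcodes):
    """Assign each read to a sample by the index sequence at its 5' end.

    reads: iterable of read sequences (strings).
    barcodes: {index_sequence: sample_name}.
    Returns (bins, unassigned), where bins maps each sample name to its
    reads with the index removed.
    """
    bins = {name: [] for name in barcodes.values()}
    unassigned = []
    for read in reads:
        for index_seq, name in barcodes.items():
            if read.startswith(index_seq):
                bins[name].append(read[len(index_seq):])  # trim the index
                break
        else:
            # No index matched exactly; keep the read aside rather than guess.
            unassigned.append(read)
    return bins, unassigned

# Hypothetical four-base indexes ending in T, and a few toy reads.
barcodes = {"ACGT": "Ste12p_ChIP", "TGCT": "input_DNA"}
reads = ["ACGTGGGAAA", "TGCTCCCTTT", "ACGTTTTAAA", "GGGGAAAACC"]
bins, unassigned = demultiplex(reads, barcodes)
```

Requiring an exact match discards reads with a sequencing error in the index, which is a conservative choice when the number of reads per lane far exceeds what each sample needs.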

Computationally, high-throughput sequencing involves handling and 
analysis of terabytes of sequencing data. Illumina sequencing is a four-color 
sequencing-by-synthesis approach where incorporation of a reversible ter- 
minator nucleotide generates a fluorescence signal detected by a high- 
sensitivity camera for A, C, G, and T during each cycle. The fluorescent 
dye is cleaved and the next base is incorporated. Typically, preliminary 
sequence data analyses are performed using built-in software supplied with 
the instrument. Fluorescent images of DNA clusters are first analyzed with a 
module called Firecrest to map cluster location while base-calling is per- 
formed with Bustard, which determines the probability of a given nucleo- 
tide using fluorescence intensities from the images. Finally, Gerald rapidly 
aligns 32 bases from the sequence reads to the reference genome using an 
algorithm called Eland, typically allowing for a maximum of two mis- 
matches. These selected parameters effectively map sequence reads back 
to the deeply sequenced yeast reference genome. For ChIP-Seq, determination 
of binding sites from the sequence data is a challenge that has been 
tackled by different groups with various algorithms (Fejes et al., 2008; Ji 
et al., 2008; Johnson et al., 2007; Jothi et al., 2008; Nix et al., 2008; 
Rozowsky et al., 2009; Valouev et al., 2008; Xu et al., 2008; Zhang et al., 
2008). ChIP-Seq analysis and the algorithms applied will be described in 
detail after the protocol section. Conceptually, sequencing reads (or tags) are 
compiled and genomic regions with an increased number of sequence tags 
compared to the tags from a control sample are considered as putative 
TFBS. Next, statistical filtering criteria are used to determine if these 
putative sites represent true binding sites. After obtaining a preliminary set 
of TFBS, additional bioinformatic analyses are necessary to interpret the 
data. These may include analysis of the location of binding sites relative to 
nearby potential target genes, comparison with gene expression information 
and gene ontology (GO) analyses of potential targets. 
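
The conceptual step of comparing tag counts between a ChIP sample and a control can be sketched as follows. This Python example counts tags in fixed windows along a single chromosome, scales the control for sequencing depth, and reports windows that exceed a fold-enrichment cutoff. It is not one of the published algorithms cited above and omits their statistical filtering; the window size, the cutoffs, and the function names are assumptions for illustration.

```python
from collections import Counter

def window_counts(tag_positions, window):
    """Count tags per fixed-width window, keyed by window index."""
    return Counter(pos // window for pos in tag_positions)

def putative_sites(chip_tags, control_tags, window=100, min_fold=5.0, min_tags=10):
    """Return (start, end, fold) for windows enriched in ChIP over control.

    chip_tags, control_tags: mapped tag start positions on one chromosome.
    """
    if not chip_tags or not control_tags:
        raise ValueError("both samples need at least one mapped tag")
    chip = window_counts(chip_tags, window)
    control = window_counts(control_tags, window)
    depth_scale = len(chip_tags) / len(control_tags)  # correct for depth
    sites = []
    for index, n_chip in sorted(chip.items()):
        # A floor of one control tag avoids division by zero in empty windows.
        expected = max(control.get(index, 0), 1) * depth_scale
        fold = n_chip / expected
        if n_chip >= min_tags and fold >= min_fold:
            sites.append((index * window, (index + 1) * window, fold))
    return sites
```

In practice, adjacent enriched windows would be merged and each candidate assigned a significance value before it is treated as a binding site.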




2. Protocols 

2.1. Chromatin immunoprecipitation 

DNA-protein complexes formed in vivo can be reversibly cross-linked 
through the application of formaldehyde, and specific DNA-protein interactions 
are isolated from covalently bound populations using an antibody 
specific to the transcription factor of interest. Figure 4.1 from Horak and 
Snyder (2002) summarizes the principal steps of ChIP. We suggest, when 
possible, tagging the transcription factor of interest with a Myc or HA 
epitope and performing the immunoprecipitation using commercial anti- 
bodies against this epitope; these antibodies generally give little background. 
As an experimental control, it is possible to IP an untagged version of the 
same strain and to follow the same protocol. The DNAs from the tagged 
and untagged strain can be used for qPCR enrichment analysis of selected 
binding sites prior to proceeding toward sequencing library generation. 
We adapted this protocol from Aparicio et al. (2004, 2005). 

(1) Grow 500 ml of yeast cells to mid-log phase 
(OD600 = 0.6–1.0). We suggest performing ChIP experiments in 
biological triplicates. 

(2) Treat cells with 14 ml 37% formaldehyde for 15 min, swirling 
every 5 min. This allows cross-linking of protein-DNA 
complexes. 

(3) Quench cross-linking reaction by adding 27 ml of 2.5 M glycine for 
10 min, swirling every 5 min. 

(4) Collect cells by filtration and wash cells twice with 100 ml of sterile 
Milli-Q (Millipore, Billerica, MA) water. Rinse the filter with 
2 × 20 ml of sterile Milli-Q water to collect cells in a 50 ml Falcon 
tube. Spin down cells at 4000 rpm for 10 min and discard supernatant. 
Resuspend the cells in 1 ml water and divide them equally between two 2 ml 
screw-cap tubes. Repeat this step. Spin down cells at top speed for 
3 min, remove the supernatant and put on ice. Measure cell weight. 
Add 1 ml zirconium beads. One can continue forward to cell lysis or 
freeze cells at -70 °C for long-term storage. 

(5) Resuspend cells in lysis/IP buffer (50 mM Hepes/KOH [pH 7.5], 
140 mM NaCl, 1 mM EDTA, 1% Triton X-100, and 0.1% sodium 
deoxycholate) with 1 mM PMSF (Fluka, Buchs, Switzerland) and 
protease inhibitors (one tablet of Roche Complete protease inhibitor 
cocktail/50 ml lysis/IP buffer) and lyse them with zirconium beads 
using a FastPrep Machine (MP Biomedical, Irvine, CA; five times 
60 s at a speed of 6.0 m/s). 

Figure 4.1 Example of a multiplex ChIP-Seq workflow highlighting the principal steps of Illumina sequencing library generation. XXXT 
and YYYT represent different index sequences. Nonbarcoded ChIP-Seq can be performed by substituting the barcoded adapters with 
standard Illumina genomic DNA adapters. 

(6) Recover lysates in a 5-ml snap-cap tube from one 2-ml screw-cap 
tube by centrifugation at 1500 rpm for 3 min, add 0.5 ml of lysis/IP 
buffer to the microfuge tube and centrifuge again. Pool the lysate from 
the other 2 ml screw-cap tube the same way. Add 1 ml of lysis/IP 
buffer prior to sonication. 

(7) Sonicate cell lysates using a Branson Digital 450 Sonifier (Branson, 
Danbury, CT) to shear DNA. Sonicate each sample five times for 
30 s at an amplitude of 50%. Between rounds of sonication, 
keep samples on ice for 2 min. The sonicated lysates should be 
clarified twice, first by centrifugation in a Sorvall centrifuge for 
5 min at 3000 rpm and then by centrifugation in an Eppendorf 
microfuge at 14,000 rpm for 10 min. 

(8) Save 250 µl of clarified, sonicated lysate prior to immunoprecipitation 
to generate input DNA for Illumina sequencing (see next protocol). 

(9) Add 2 ml of lysis/IP buffer to each sample; the total volume should be 
around 6 ml. 

(10) Prewash the entire bottle of antibody-coupled beads using lysis/IP 
buffer. Remove the beads with a broadened 1 ml pipette and transfer 
to a 15 ml Falcon tube. Wash the bottle three times with 1 ml of fresh 
lysis/IP buffer to collect all the beads in the 15 ml tube. Vortex briefly 
and spin 2 min at 2000 rpm in a 4 °C centrifuge. Remove supernatant. 
Repeat three times with 4–5 ml fresh lysis/IP buffer. Resuspend the 
beads in an equal volume of lysis/IP buffer (1 ml). For Myc- or HA-tagged 
strains, we use Sigma EZview anti-Myc affinity gel (Sigma, St. 
Louis, MO) and Sigma EZview anti-HA affinity gel (Sigma). One 
antibody bottle can be used for 12 samples. 

(11) Add 150–300 µl of prewashed beads to each sample. Immunoprecipitate 
overnight (12–16 h) on a rocker in the cold room. 

(12) After incubation, fill Falcon tube with fresh lysis/IP buffer, pellet 
antibody beads by spinning 5 min at 3000 rpm in a cold centrifuge 
and discard supernatant. 

(13) Wash the immunoprecipitated samples with 10 ml of appropriate 
buffer for 5–10 min on a rocker in the cold room. Between washes, 
spin down the beads in a cold centrifuge at 2000 rpm for 2 min. Wash 
twice with lysis/IP buffer, once with IP/500 mM NaCl buffer (18 ml 
5 M NaCl added to 232 ml of lysis/IP buffer), twice with IP wash 
buffer (10 mM Tris-HCl, 0.25 M LiCl, 0.5% NP-40, 0.5% sodium 
deoxycholate, and 1 mM EDTA), and once with 1× TE (50 mM 
Tris-HCl, 10 mM EDTA, pH 8.0). Following the last wash in TE, 
keep some buffer to transfer beads to a 1.5-ml tube. 

(14) Transfer beads from the 15 ml Falcon tube to a 1.5-ml microcentrifuge 
tube using a broadened 1 ml pipette. Transfer the remaining 
beads with an additional 0.5 ml of 1× TE. Spin down the beads for 
3 min at 14,000 rpm and remove all TE. 

(15) Elute the immunoprecipitate from the beads by adding 100–150 µl 
1× TE/1% SDS and incubating for 15 min at 65 °C. After 10 min, 
mix samples briefly. Pellet the beads at 14,000 rpm for 1 min and 
transfer eluate to a new 1.5 ml tube. Add 150–200 µl of 1× TE/0.67% 
SDS to the beads and incubate for 10 min at 65 °C. Pellet the beads 
and pool the eluate with the previous one. Spin down the pooled eluates 
at top speed for 2 min to remove all beads, as their presence will reduce 
cross-linking reversal efficiency and therefore ChIP DNA recovery. 
Transfer eluates to a 2-ml screw-cap tube, avoiding the last 10 µl and 
the beads at the bottom of the tube. 

(16) Reverse protein-DNA cross-links by incubating at 65 °C overnight 
or for 6–8 h. 

(17) Treat samples with proteinase K to remove all proteins from samples. 
Dilute 20 mg/ml proteinase K (Ambion, Austin, TX) 50-fold in 1× 
TE. Add 250 µl of diluted proteinase K solution per sample. Incubate 
between 37 and 50 °C for 2–4 h. 

(18) Precipitate DNA with ethanol. Add 3 µl of 20 mg/ml glycogen, 
2–5 µl of pellet paint (Novagen, San Diego, CA), 45 µl of 5 M 
LiCl, and 1 ml of 100% ethanol. Mix thoroughly and incubate at 
-20 °C overnight or for at least several hours. Place samples at -70 °C 
for 1 h. Spin in a cold centrifuge for 20 min at top speed and remove 
supernatant. The DNA pellet should be slightly pink due to pellet paint. 
Wash with 1 ml 70% ethanol for 5 min, spin in a cold centrifuge for 
10 min at top speed and remove supernatant. Air dry for 10 min. 
Resuspend in 100 µl 1× TE. 

(19) Purify DNA using MinElute PCR purification kit (Qiagen, Valencia, 
CA). We recommend processing the ChIP DNA sample in two 
MinElute spin columns. Elution is done in 21 µl EB per column 
and the two eluates from the same sample are pooled. Samples are 
stored in a -20 °C freezer. 
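
The working concentrations implied by several steps of this protocol can be checked with simple dilution arithmetic. The Python sketch below does so for the formaldehyde (step 2), glycine (step 3), the IP/500 mM NaCl wash buffer (step 13), and the proteinase K treatment (step 17). It assumes that volumes are additive and that the culture volume is still 500 ml when formaldehyde is added; the helper function name is hypothetical.

```python
def final_concentration(stock_conc, stock_vol, base_vol, base_conc=0.0):
    """Concentration after adding stock_vol of stock to base_vol of a
    solution that already contains the solute at base_conc.
    Volumes must share one unit; the result is in the unit of stock_conc."""
    total_vol = stock_vol + base_vol
    return (stock_conc * stock_vol + base_conc * base_vol) / total_vol

# Step 2: 14 ml of 37% formaldehyde into a 500 ml culture (result in %).
formaldehyde_pct = final_concentration(37.0, 14.0, 500.0)

# Step 3: 27 ml of 2.5 M glycine into the 514 ml cross-linked culture (mM).
glycine_mM = final_concentration(2500.0, 27.0, 514.0)

# Step 13: 18 ml of 5 M NaCl into 232 ml of lysis/IP buffer,
# which already contains 140 mM NaCl (result in mM).
nacl_mM = final_concentration(5000.0, 18.0, 232.0, base_conc=140.0)

# Step 17: 20 mg/ml proteinase K diluted 50-fold gives 0.4 mg/ml,
# which equals 0.4 µg/µl; 250 µl of this is added per sample (result in µg).
proteinase_k_ug = (20.0 / 50.0) * 250.0
```

Under these assumptions the values come to about 1% formaldehyde, about 125 mM glycine, about 490 mM NaCl (consistent with the nominal 500 mM), and 100 µg of proteinase K per sample.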

This procedure typically yields 100-300 ng of ChIP DNA. DNA con- 
centrations can be measured using a Nanodrop spectrophotometer 
(Thermo Scientific, Waltham, MA) or PicoGreen dsDNA quantification 
assay (Invitrogen, Carlsbad, CA). qPCR analysis should be performed prior 
to generation of sequencing library. We typically compare ChIP samples 
from a tagged strain (experimental sample) versus an untagged strain (con- 
trol sample) for enrichment in the experimental sample at three known 
binding sites and at a genomic locus where the transcription factor is not 
expected to bind (negative control). We have found that ChIP efficiency is 
the most critical step for success of the entire procedure. Here we suggest 
steps in the protocol for quality control as well as parameters that can be 
modified: 

(a) Formaldehyde cross-linking: Changing the concentration of formaldehyde 
and the duration of cross-linking can modify the extent of cross-linking. 
Too much cross-linking can mask the HA or Myc epitopes on the tagged 
transcription factor, while too little cross-linking will decrease the 
immunoprecipitation of the associated DNA. 

(b) Cell lysis: Five 1-min bursts in the FastPrep machine typically lyse over 95% 
of cells. Breaking cells using a paint shaker for 30 min lyses about 
40% of cells. 

(c) Sonication: Chromatin should be sheared to a median size of 450–500 
base pairs (bp), as measured by gel electrophoresis in a 2% agarose 
gel. After step 7, take 250 µl of clarified lysate and add an equal volume 
of 1× TE/1% SDS. Follow the aforementioned protocol from steps 
16 to 18, without purification through a MinElute spin column. Then 
load on a 2% agarose gel for electrophoresis. Ideally, a smear between 
100 and 1000 bp should be present, with a median size of 450–500 bp 
(stronger smear intensity). 

(d) Antibody: Prior to performing a ChIP experiment with a new tagged 
strain, a Western blot should be performed to confirm the correct 
insertion of the epitope. Antibody quantities can also be optimized by 
preliminary IP experiments with various amounts of antibody. 
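
For the qPCR check of tagged versus untagged ChIP samples described before this list, enrichment at a known binding site can be expressed relative to the negative control locus. The Python sketch below uses the common 2^(-ΔΔCt) calculation, which is an assumption on our part for illustration: the protocol above does not prescribe a particular formula, and the calculation presumes that both primer pairs amplify with close to 100% efficiency. The function name and Ct values are hypothetical.

```python
def fold_enrichment(ct_tagged_site, ct_tagged_neg, ct_untagged_site, ct_untagged_neg):
    """Enrichment of a binding site in the tagged ChIP relative to the
    untagged ChIP, each normalized to the negative control locus.
    A lower Ct means more template, hence the negative exponent."""
    delta_tagged = ct_tagged_site - ct_tagged_neg
    delta_untagged = ct_untagged_site - ct_untagged_neg
    return 2.0 ** -(delta_tagged - delta_untagged)

# Hypothetical Ct values: the site amplifies 5 cycles earlier than the
# negative locus in the tagged sample and equally in the untagged sample.
example = fold_enrichment(24.0, 29.0, 29.0, 29.0)
```

With these example values the binding site is enriched 32-fold in the tagged sample; a value near 1 at all three known sites would suggest that the immunoprecipitation did not work.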



2.2. Input DNA preparation 

Input DNA serves as an important reference sample for ChIP-Seq experiments 
(Lefrancois et al., 2009; Robertson et al., 2007; Rozowsky et al., 
2009). It is used during the scoring process where TFBS are determined 
based on the sequence reads obtained from a ChIP sample in comparison to 
input DNA. Input DNA consists of sonicated cross-linked chromatin that is 
processed in parallel to a ChIP sample, but without the immunoprecipitation 
step. Recent reports have suggested that input DNA represents breaks in 
chromatin regions of increased accessibility (Auerbach et al., 2009; 
Teytelman et al., 2009). However, it is currently debated whether 
input DNA, normal IgG or, in the case of yeast, an untagged strain should 
be the control sample for scoring ChIP-Seq data. Here we present our 
laboratory's protocol for isolation of input DNA, which starts at step 8 of the 
previous ChIP protocol. 

(20) Combine 250 µl of 1× TE/1% SDS with the reserved 250 µl of clarified 
sonicated lysate (from step 8, ChIP protocol) in a 2-ml screw-cap tube. 

(21) Reverse cross-links overnight by incubating at 65 °C. 

(22) Treat samples with proteinase K as described (step 17, ChIP protocol). 

(23) Extract input DNA three times with phenol:chloroform:isoamyl alcohol 
(25:24:1) (Fluka) followed by a single extraction with chloroform 
alone. In each case, keep the upper aqueous phase. 

(24) Precipitate DNA with ethanol by adding 50 µl of 5 M LiCl and 1 ml of 
100% ethanol to the upper aqueous phase from the last chloroform 
extraction. Enhance precipitation by transferring to -20 °C for 1 h. 
Centrifuge samples at top speed for 20 min and discard supernatant. 
Wash with 1 ml of 70% ethanol in the cold room for 5 min, spin down 
DNA at top speed for 10 min, discard ethanol and air dry for 10 min. 
Resuspend DNA in 1× TE (pH 8.0). 

(25) RNase-treat the input DNA sample. Add 2 µl of 10 mg/ml DNase-free 
RNase A (Roche, Indianapolis, IN) and incubate for 30 min at 
37 °C. 

(26) Purify DNA using a MinElute PCR purification column (Qiagen). 
Elution is done in 21 µl of EB. 

The amount of input DNA recovered using this procedure is much 
greater than that of a ChIP sample. We recommend the use of one-fifth of 
each input DNA sample for Illumina sequencing library preparation. If the 
upper phase seems unclear and the interphase is still very cloudy after the 
three phenol:chloroform:isoamyl alcohol extractions, an additional extraction 
should be performed. The phase-lock gel system (5 Prime, Gaithersburg, 
MD) can be used to perform safer extractions with higher recovery of 
DNA because the organic phase and the interphase are sequestered 
physically at the bottom. This facilitates the removal of the upper, aqueous 
phase containing DNA. 



2.3. Illumina sequencing DNA library generation 

ChIP samples must be converted into DNA libraries for sequencing. 
Protocols differ depending on the sequencing platform used; 454/Roche, 
Solexa/Illumina, SOLiD/ABI, and Helicos each use different strategies to 
create a library representing the population of short DNA fragments 
selected by ChIP. Analysis of TFBS by sequencing technologies does not 
require very long sequencing reads; large numbers of short reads (e.g., 
35 bp) are sufficient for mapping binding sites in most organisms. Therefore, 
Illumina/Solexa and ABI/SOLiD have been favored over Roche/454 
because they both generate millions of very short reads (about 35 bases/ 
read) whereas Roche/454 generates fewer but longer reads (200–300 
bases/read). Currently, most ChIP-Seq studies have been performed on the 
Illumina platform and a few have used SOLiD. Here we describe our 
procedure to generate standard, nonbarcoded Illumina libraries. We have 
optimized the ChIP-Seq protocol used in mammalian cell line experiments 
(Robertson et al., 2007) for the yeast context, which follows the manufacturer's 
guidelines. During Illumina library generation, oligonucleotide 
adapters are introduced at the ends of the small ChIP DNA fragments that 
were bound previously by the transcription factor of interest. These adapters 
allow hybridization of the sample to a flowcell containing a lawn of primers, 
which is used for subsequent cluster generation and sequencing-by-synthesis. 

During Illumina library preparation, the sheared ChIP DNA is first
end-repaired. A single adenosine base ("A") is then added to the 3' end of
both strands, followed by annealing and ligation to the double-stranded
adapter containing a "T" overhang. A short PCR amplification (15–17
cycles) with primers annealing to the adapter sequence is performed to
generate a population of adapter-ChIP DNA fragments termed the library.
Size selection on a 2% agarose gel allows isolation of the amplified DNA
library between 150 and 350 bp. This is the optimal range of fragment
size for hybridization to the flowcell and cluster generation according
to Illumina's recommendations.

According to bioinformatic simulations based on the yeast genome, only
260,000 uniquely mapped reads would be sufficient to determine at least
95% of the TFBS of a typical TF with point-like (punctate) binding sites,
provided these binding sites are enriched at least fivefold in the ChIP
sample (Lefrancois et al., 2009). This is very low compared to the human
genome, for which 12 M mapped reads are usually used (Rozowsky et al.,
2009). A single Illumina flowcell lane generates about 8 M mapped reads,
so multiple yeast ChIP-Seq samples can be sequenced simultaneously using
multiplex Illumina sequencing (Lefrancois et al., 2009). As shown in
Fig. 4.1, to generate barcoded Illumina libraries, one can replace
Illumina's genomic DNA adapters with custom-made adapters that contain
the adapter sequence from the Illumina genomic DNA adapter followed by a
nucleotide tag of at least two bases (called the barcode or index; we
usually use three bases) and terminated by a single "T" for annealing and
ligation to the end-repaired DNA containing an "A" overhang (Craig et al.,
2008; Cronn et al., 2008; Lefrancois et al., 2009). Standard Illumina
genomic DNA PCR primers are used and the rest of the procedure is
unchanged. ABI/SOLiD has offered an indexing strategy since the
commercial launch of its platform.
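
The multiplexing argument above is simple arithmetic, and a minimal
sketch of it is given here. The two read counts are the figures quoted in
the text; the function name and the safety_factor parameter are
hypothetical, introduced only to show how a margin for uneven barcode
representation could be reserved. The sketch also treats "mapped" and
"uniquely mapped" reads as interchangeable, which is a simplification.

```python
# Planning figures quoted in the text; treat them as rough estimates.
READS_PER_LANE = 8_000_000          # mapped reads from one flowcell lane
READS_NEEDED_PER_SAMPLE = 260_000   # uniquely mapped reads per yeast sample


def max_samples_per_lane(reads_per_lane, reads_per_sample, safety_factor=1.0):
    """Upper bound on samples per lane if reads were split evenly.

    safety_factor (hypothetical) multiplies the per-sample requirement to
    leave room for uneven barcode representation and non-unique reads.
    """
    if reads_per_sample <= 0 or safety_factor <= 0:
        raise ValueError("reads_per_sample and safety_factor must be positive")
    return int(reads_per_lane // (reads_per_sample * safety_factor))


# Theoretical ceiling, and a more cautious estimate with a 4x margin.
print(max_samples_per_lane(READS_PER_LANE, READS_NEEDED_PER_SAMPLE))
print(max_samples_per_lane(READS_PER_LANE, READS_NEEDED_PER_SAMPLE,
                           safety_factor=4))
```

The ceiling computed this way is well above the four barcodes used in
this chapter, which leaves a wide margin per sample.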

(27) Perform gel electrophoresis on a 2% agarose gel with at least 100 ng of
ChIP DNA from step 19 (or input DNA) and size select the DNA smear
between 100 and 700 bp. For ChIP, we usually use between 15 and 35 µl of
MinElute-purified DNA from step 19. For input, due to its higher DNA
concentration, one can either load a lower volume of MinElute-purified
DNA from step 26 on the gel (5–10 µl) or gel-purify the same volume as
for ChIP but use only 20–25% of the gel-purified input DNA for the next
steps. A 100-bp DNA ladder should be included during gel electrophoresis.
Samples should not migrate too far on the agarose gel, so that the
100–700 bp smear can be isolated in a relatively small gel volume.
Samples are typically run for ~20 min at 100–110 V. The QIAquick gel
extraction kit (Qiagen) is used and elution is done in 34 µl EB. Although
this size selection step is optional, we recommend it to exclude very
short and very long fragments, which are not suitable for Illumina
sequencing. For input DNA, the smear should be intense, while for ChIP
DNA the smear should be visible although much fainter.

(28) End-repair DNA for 45 min at room temperature using the End-It DNA
end-repair kit (Epicentre, Madison, WI). DNA fragments are blunted by
end-repair and all 5' ends are phosphorylated. DNA is purified after
end-repair using a QIAquick PCR purification column (Qiagen) and eluted
in 34 µl EB.

(29) Add a single adenosine nucleotide ("A") to the 3' blunted ends of
end-repaired DNA fragments (in 34 µl EB). Set up a reaction with the
eluted sample from step 28, 10 µl of 1 mM dATP, 5 µl of 10× NEB buffer 2,
and 1 µl of Klenow fragment (3'→5' exo minus) (NEB, Ipswich, MA). Mix all
components in a PCR plate and cover with a sealing microfilm. The
reaction is performed at 37 °C for 30 min in a PCR machine, without the
use of a heated lid. Aliquots of 1 mM dATP should be prepared from a
100-mM dATP stock solution (Invitrogen) and frozen at −20 °C. Repeated
freeze-thaw cycles should be avoided. The low concentration of dATP
permits the addition of a single "A." A MinElute PCR purification column
(Qiagen) is used to purify the reaction and DNA is eluted in 10 µl EB.

(30) Ligate Illumina genomic DNA adapters (Illumina, San Diego, CA) or
barcoded adapters to the sample for 15 min at room temperature. Mix
10 µl of sample from step 29, 1 µl of diluted oligonucleotide adapters,
1.5 µl of LigaFast T4 DNA Ligase (3 units/µl; Promega, Madison, WI), and
12.5 µl of Rapid Ligation Buffer (Promega). The dilution of Illumina
genomic DNA adapters depends on the nature of the sample. For input DNA,
Illumina nonbarcoded genomic DNA adapters are diluted 1:20 with Gibco
RNase-free, DNase-free water (Invitrogen); for ChIP DNA, Illumina
adapters are diluted 1:40. After the 15 min reaction, ligation products
are purified with a MinElute PCR purification column and eluted in 10 µl
EB. These adapters contain an unpaired "T" overhang which anneals to the
3' "A" on the sample DNA. Barcoded adapters must be annealed before being
added to the end-repaired DNA. The concentration of diluted barcoded
adapters for ligation to input DNA or ChIP DNA samples should match that
of standard genomic DNA adapters. Barcoded adapter design and annealing
are described in the next section.

(31) Perform gel electrophoresis on a 2% agarose gel and size select the
DNA smear between 150 and 500 bp. We have found that this gel
purification prior to PCR amplification increases the quality of the
sequencing libraries. More importantly, it decreases the intensity and
occurrence of adapter-adapter dimers after the PCR amplification.
Adapter-adapter dimers amplify preferentially during the following PCR
step and appear as a compact bright band around 100–120 bp. This compact
band can totally or partially replace the normal smear indicative of a
successful library. For this reason, DNA fragments below 150 bp should be
excluded at this step. We recommend using a 2% agarose E-Gel to keep
samples adequately separated during loading and migration. Load 20 µl of
a 1:10 diluted Track-It 50 bp DNA ladder (Invitrogen). Add 3 µl of a 1:10
diluted Track-It Cyan/Orange loading buffer (Invitrogen) to each sample.
Samples should be separated by at least two empty wells. Load 20 µl of
Gibco RNase-free, DNase-free water into all empty wells. Perform gel
electrophoresis for 20 min. Recover DNA using the QIAquick gel extraction
kit (Qiagen) and elute ligated samples in 28 µl EB. At this step, input
DNA libraries should be visible but rather faint, while ChIP DNA
libraries are fainter than input DNA libraries and sometimes cannot be
seen at all. The lack of a visible ChIP DNA smear at this step does not
prevent generation of successful and high-quality libraries.

(32) Amplify the sequencing library by PCR using Illumina genomic DNA
primers 1.1 and 2.1. In a PCR plate, mix 28 µl of eluted DNA sample from
step 31, 1 µl of 1:1 diluted Illumina genomic DNA primer 1.1, 1 µl of 1:1
diluted Illumina genomic DNA primer 2.1, and 30 µl of Phusion Master Mix
with HF Buffer (NEB). Use the following PCR settings with a heated lid:
denaturation at 98 °C for 30 s, 17 cycles of amplification (10 s at
98 °C, 30 s at 65 °C, and 30 s at 72 °C), a final extension at 72 °C for
5 min, and a cool down to 4 °C. Remove enzymes and buffer using a
MinElute PCR purification column (Qiagen) and elute in 10 µl EB.

(33) Size select the Illumina sequencing library between 150 and 350 bp by
gel electrophoresis on a 2% agarose gel. These size specifications meet
the manufacturer's guidelines for cluster generation, which is optimal at
a median fragment size of about 230 bp. The use of a 2% agarose E-Gel is
preferable. Loading of samples and ladder is identical to step 31. Run
gel electrophoresis for 20 min. A picture of the final library on the gel
should be taken. At this step, a medium-to-high intensity smear over
150 bp and under 500 bp should be easy to visualize, suggesting that the
sequencing library preparation was successful. If there is a faint,
well-defined band at around 100–120 bp, extreme care should be taken
during gel excision to avoid this adapter-adapter dimer band completely.
The presence of adapter-adapter dimers during sequencing greatly
decreases the overall mappability of sequencing reads, so the number of
uniquely mapping reads could be very low. Gel extraction is done using a
MinElute gel extraction kit (Qiagen) and elution is done in 20–25 µl EB.

(34) Measure DNA concentration and the A260/A280 ratio using a Nanodrop
spectrophotometer (Thermo Scientific). Good quality libraries have an
A260/A280 ratio between 1.7 and 2.0. Lower values indicate poor quality.
The minimal DNA concentration to proceed toward Illumina sequencing is
5.0 ng/µl. Libraries with lower DNA concentrations should be discarded.
We typically obtain DNA concentrations over 8.0 ng/µl for ChIP DNA
libraries and over 15.0 ng/µl for input DNA libraries.

(35) Store Illumina sequencing libraries at −70 °C until they are processed
for sequencing.

Samples are now ready for the sequencing step of the ChIP-Seq procedure.
They are compatible with the Illumina Genome Analyzer and Genome
Analyzer II. Generation of barcoded libraries follows an identical
procedure, except that barcoded adapters are added at step 30 instead of
Illumina genomic DNA adapters. Prior to sequencing, barcoded DNA
libraries must be mixed together in an equimolar ratio using the DNA
concentrations obtained from the Nanodrop. A more precise method to
measure DNA concentrations, such as the PicoGreen dsDNA quantification
assay (Invitrogen), could also be used, although we have obtained good
barcode representation with Nanodrop concentrations (less than twofold
difference in the number of mapped reads between the least abundant and
the most abundant barcoded sample). Here are a few considerations for
Illumina sequencing DNA library generation:

(e) ChIP efficiency: An insufficient amount of starting DNA material (in this
case, ChIP DNA) is the most important cause of failure in library
preparation. Failure is indicated by the complete absence of a smear on
the final agarose gel in step 33, the presence of a single intense
adapter-adapter dimer band at 100–120 bp, or the co-occurrence of a
strong adapter-adapter dimer band and a very faint library smear.
Scaling up the ChIP protocol is one way to generate more DNA, as is the
use of tagged strains and/or ChIP-grade antibodies.

(f) Adapter dilution: It is crucial to dilute barcoded adapters to the working
concentration of the diluted standard Illumina genomic DNA adapters. If
the concentration of barcoded adapters is too high, it may favor the
ligation of adapters to other adapters, resulting in a strong
adapter-adapter dimer band during the final gel extraction (step 33) or
in the absence of the DNA smear indicative of a successful library.
Optimization of concentrations should first be performed on input DNA.
Similarly, if problems occur using Illumina genomic DNA adapters,
optimization should be performed on input DNA and then on ChIP DNA.

(g) PCR amplification: PCR amplification using Illumina genomic DNA
primers 1.1 and 2.1 should stay in the linear range to avoid
overrepresentation of some genomic areas in the sequencing library. Read
counts would then be very high for these overrepresented regions,
creating a bias during data analysis. The manufacturer recommends no
more than 18 PCR cycles, and fewer cycles can be used. The common range
for PCR amplification of Illumina DNA libraries lies between 13 and 18
cycles.
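
The acceptance thresholds of step 34 and the pooling of barcoded
libraries can be sketched as follows. This is only an illustration under
stated assumptions: the function names and the example concentrations are
hypothetical, and "equimolar" is approximated by "equal mass", which holds
only if all libraries share the same size distribution (here, the
150–350 bp window selected in step 33).

```python
def passes_qc(conc_ng_per_ul, a260_280, min_conc=5.0, ratio_range=(1.7, 2.0)):
    """Apply the thresholds quoted in step 34 to one library."""
    low, high = ratio_range
    return conc_ng_per_ul >= min_conc and low <= a260_280 <= high


def pooling_volumes(libraries, mass_per_library_ng):
    """Volume (in µl) of each library that contributes the same DNA mass.

    libraries maps a barcode to (concentration in ng/µl, A260/A280 ratio).
    Equal mass is used as a stand-in for equimolar (assumption: similar
    fragment size distributions across libraries).
    """
    volumes = {}
    for name, (conc, ratio) in libraries.items():
        if not passes_qc(conc, ratio):
            raise ValueError(f"library {name} fails the QC thresholds")
        volumes[name] = mass_per_library_ng / conc
    return volumes


# Hypothetical Nanodrop readings for four barcoded ChIP libraries.
libraries = {
    "ACGT": (8.0, 1.85),
    "CATT": (10.0, 1.90),
    "GTAT": (16.0, 1.80),
    "TGCT": (20.0, 1.95),
}
volumes = pooling_volumes(libraries, mass_per_library_ng=40.0)
for name, vol in volumes.items():
    print(f"{name}: {vol:.2f} µl")
```

If the libraries differ noticeably in fragment size, the mass would need
to be converted to molarity before pooling; that conversion is left out
of this sketch.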

2.4. Barcode design and adapter annealing 

This section applies specifically to barcoded ChIP-Seq on an Illumina
platform. Multiplex sequencing-by-synthesis has been accomplished
through the introduction of indexed (or barcoded) adapters (Craig et al.,
2008; Cronn et al., 2008; Lefrancois et al., 2009). These strategies have
allowed multiplex sequencing and analysis of HapMap loci from different
individuals (Craig et al., 2008), chloroplast genomes from different
species (Cronn et al., 2008), and yeast ChIP samples (Lefrancois et al.,
2009), without the introduction of barcode-induced errors or artifacts.
In all cases, a barcode was introduced after the Illumina adapter
sequence required for PCR amplification and hybridization to Illumina's
flowcell. The resulting sequencing reads first contain the index,
followed by the sequenced DNA sample. The barcode must end with a "T" for
pairing and ligation to the end-repaired DNA with an "A" overhang. We
have used four indexes for barcoded ChIP-Seq: ACGT, CATT, GTAT, and TGCT.
Each consists of a three-base index followed by the final "T," and no two
barcodes share the same base at any of the three index positions. We
chose this design mainly for two reasons. First, these barcodes have a
balanced nucleotide composition, in compliance with the manufacturer's
guidelines. Second, one- or two-base sequencing errors would not result
in a read being assigned to an erroneous sample, as the remaining index
base would not match another sample. In our work, the barcode must be
intact in all sequencing reads assigned to a sample. With the new
Illumina Genome Analyzer II, the increase in read length and the
decrease in error rates at later sequenced bases permit the use of
longer barcodes. This, coupled to an expected increase in the number of
sequencing reads, could significantly increase the level of multiplexing
in yeast ChIP-Seq studies as well as in ChIP-Seq studies of other
small-genome organisms. Here we present the procedure for
oligonucleotide design and annealing to generate barcoded adapters.
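
Before ordering oligonucleotides, the design rules above can be checked
programmatically. The sketch below is an illustration, not the authors'
tooling: the function names are hypothetical, and the two constant
adapter segments were obtained by removing the barcode-specific bases
from the sequences listed in Table 4.1.

```python
# Constant parts of the oligos in Table 4.1 (barcode-specific bases removed).
ADAPTER_FWD_PREFIX = "ACACTCTTTCCCTACACGACGCTCTTCCGATCT"
ADAPTER_REV_SUFFIX = "AGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def check_barcodes(barcodes):
    """Return a list of rule violations; an empty list means the set is fine.

    Rules taken from the text: every barcode ends with "T", and no two
    barcodes share a base at any index position (final "T" excluded).
    """
    problems = []
    if len({len(bc) for bc in barcodes}) != 1:
        return ["barcodes differ in length"]
    for bc in barcodes:
        if not bc.endswith("T"):
            problems.append(f"{bc} does not end with T")
    for pos in range(len(barcodes[0]) - 1):
        column = [bc[pos] for bc in barcodes]
        if len(set(column)) != len(column):
            problems.append(f"index position {pos + 1} repeats a base")
    return problems


def adapter_oligos(barcode):
    """Forward and reverse oligo sequences (5'->3') for one barcode.

    The reverse oligo starts with the reverse complement of the index
    without the final T; its 5' phosphate is a synthesis modification and
    is not represented in the sequence string.
    """
    forward = ADAPTER_FWD_PREFIX + barcode
    reverse = reverse_complement(barcode[:-1]) + ADAPTER_REV_SUFFIX
    return forward, reverse


barcodes = ["ACGT", "CATT", "GTAT", "TGCT"]
print(check_barcodes(barcodes))
for bc in barcodes:
    print(bc, *adapter_oligos(bc))
```

The same check can be rerun on any longer barcode set before synthesis,
as long as every barcode keeps the final "T".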

(36) Synthesize oligonucleotides at a 0.05 µmol scale with HPLC purification
from MWG/Operon (Eurofins MWG Operon, Huntsville, AL). Oligonucleotide
sequences are given in Table 4.1. Note that the forward primer contains
the index and the final "T" at the 3' end while the reverse primer is
phosphorylated at the 5' end and the reverse-complement index sequence is
found at the 5' end.

Table 4.1 Oligonucleotide sequences for barcoded ChIP-Seq

Barcode  Forward/reverse  Sequence (5' → 3')^a
ACGT     Forward^b        ACACTCTTTCCCTACACGACGCTCTTCCGATCTACGT
         Reverse^c        CGTAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG
CATT     Forward^b        ACACTCTTTCCCTACACGACGCTCTTCCGATCTCATT
         Reverse^c        ATGAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG
GTAT     Forward^b        ACACTCTTTCCCTACACGACGCTCTTCCGATCTGTAT
         Reverse^c        TACAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG
TGCT     Forward^b        ACACTCTTTCCCTACACGACGCTCTTCCGATCTTGCT
         Reverse^c        GCAAGATCGGAAGAGCTCGTATGCCGTCTTCTGCTTG

a From Lefrancois et al. (2009).
b No modification.
c 5' ends are phosphorylated.

(37) Resuspend each primer in annealing buffer (10 mM Tris [pH 7.5],
50 mM NaCl, 1 mM EDTA) to 200 µM.

(38) Mix the forward and reverse primers for each index pair in equal
volumes to a final concentration of 100 µM.

(39) Heat to denature in a wet heat block at 95 °C for 5 min.

(40) Move the heat block to room temperature and let the primers cool
down for 45 min to promote annealing.

(41) Keep on ice for a few minutes and store barcoded adapters at −20 °C.

(42) Dilute barcoded adapters with Gibco RNase-free, DNase-free water
(Invitrogen) to the working concentrations of Illumina genomic DNA
adapters for generation of input DNA and ChIP DNA libraries (previous
protocol). Annealed indexed adapters differ from each other in
concentration and therefore require different dilutions to reach an
adequate working concentration. As an example, with our barcoded adapters
given in Table 4.1, we have diluted all four adapters 1:30 to generate
barcoded input DNA libraries, whereas for barcoded ChIP DNA libraries we
have diluted each adapter differently: 1:750 for ACGT, 1:450 for CATT,
1:500 for GTAT, and 1:330 for TGCT.

2.5. Illumina sequencing

We follow the manufacturer's protocols and guidelines. Detailed protocols
for operating the cluster station and sequencing with the Genome Analyzer
II are available from Illumina's web site. Here we only briefly describe
the various steps of Illumina sequencing. The first step is cluster
generation on Illumina's cluster station. The cluster station uses
microfluidics to physically attach DNA from the sequencing library (step
34) onto one lane of an Illumina flowcell. An Illumina flowcell contains
eight lanes and each lane has a lawn of primers with a sequence
complementary to the Illumina adapter sequence. Samples are denatured and
each single-stranded ChIP DNA fragment hybridizes to a lawn primer via
one adapter end. Solid-phase bridge amplification then replicates the
template DNA fragment from the paired adapter-primer. After denaturation
of this double-stranded bridge of DNA, the initial template DNA is washed
away and the flowcell-attached replica of the template can undergo
successive rounds of bridge amplification to generate a cluster. A
cluster contains about 1000 copies of an identical initial template.
There are typically 100,000–120,000 clusters on a single tile, with 100
tiles per flowcell lane. This can give rise to 10–12 M reads per lane.
DNA loaded on the flowcell should be at a concentration between 3 and
5 pM, and the optimal library size is between 150 and 350 bp. If the
fragments in the library are smaller, too many small clusters will be
present due to the shorter reach of bridge amplification, so that fewer
clusters pass the quality metrics and fewer reads are mapped. Conversely,
if the fragments are larger, fewer and bigger clusters will be generated
due to the greater reach of bridge amplification, again resulting in a
decreased number of clusters and mapped reads. Just prior to sequencing,
a sequencing primer is annealed. The flowcell is then transferred from
the cluster station to the Genome Analyzer II for sequencing of the DNA
clusters. Illumina employs a four-color sequencing-by-synthesis method.
Fluorescently labeled reversible-terminator nucleotides are added
simultaneously and one base is incorporated per cluster. Laser
excitation and fluorescence detection identify the first base. The
fluorescent dye is then cleaved off and the first base is unblocked so
that the second base can be added using the same reagents. This process
is continued for 34–36 cycles, sequencing one base at a time. Each read
starts with the template DNA sequence or, in the case of barcoded
ChIP-Seq, with the 4-bp barcode. The following section focuses on the
analysis of sequencing reads.




3. Sequencing Data Management 

Illumina uses a massively parallel sequencing-by-synthesis approach.
A typical run on the Illumina instrument lasts 2–3 days and generates at
least 1 terabyte of data, which poses a major challenge for data storage
and data transfer. Analyzing these raw image data to obtain biologically
meaningful sequences is also a computationally intensive task. We use the
Genome Analysis Pipeline (GAP) software (Illumina) to analyze the
sequencing data. The minimum system requirement for running this
software is a dual-processor, dual-core computer. If multiprocessor
facilities are available, the data analysis time can be greatly reduced
by parallelization. The outputs produced by GAP are stored in a
hierarchical directory structure called the "run folder." Currently our
run folder resides on the Yale biomedical high-performance computing
cluster, which consists of 170 Dell PowerEdge 1955 nodes, each containing
two dual-core 3.0 GHz EM64T Intel CPUs and 16 GB RAM.

It is very important to have an IT infrastructure with sufficient
computing capacity and data storage and transfer capabilities to support
the Illumina Genome Analyzer. Depending on the scale of sequencing runs,
a laboratory can also consider a commercially available Laboratory
Information Management System (LIMS), such as WikiLIMS (BioTeam).



4. Genome Analysis Pipeline 

Since detailed instructions for installing and running the GAP are
available from Illumina, we will only briefly introduce the functionality
of the pipeline modules related to ChIP-Seq data analysis. Users can
refer to the GAP documentation for more details. The documentation files
are provided with the Genome Analyzer machine setup, or can be browsed
through publicly accessible domains. The link to the documentation files
at Yale University is http://sysgl.cs.yale.edu:3443/pDir/GAP-1.1.0-docs/.

There are three main modules in the GAP. The first module, Firecrest, is
an image-analysis module. The images are generated from
sequencing-by-synthesis at hundreds of thousands of clusters. At each
cluster, the sequencing machine records four images of added nucleotides
(A, G, C, or T) at each synthesis cycle. Firecrest analyzes the images
captured by the sequencing machine and remaps cluster positions. In an
updated version, Illumina introduced the Integrated Primary Analysis and
Reporting (IPAR) software, which processes images and performs quality
control in real time. IPAR removes the need to store raw images. The
second module, Bustard, performs base-calling. From the four images
captured for A, G, C, and T at each round of synthesis, Bustard
calculates the occurrence probability of each nucleotide at each cluster
and, after 30–34 cycles of synthesis, concatenates the nucleotides with
the highest occurrence probabilities into a short sequence tag whose
length equals the number of synthesis cycles. It also has built-in
quality control mechanisms to determine the confidence of its
base-calling. The third module, Gerald, aligns short sequence reads to a
reference genome. The alignment software (Eland) in the Gerald module
runs very fast and accurately aligns sequence tags of less than 32 bp to
a reference genome. Several other open-source programs can also align a
large number of short sequences, such as SOAP (Zhang et al., 2008) and
MAQ (Li et al., 2008). SOAP has a unique feature of aligning sequences
across small gaps in the genome, which is helpful in sequencing
transcriptomes. MAQ has dedicated modules for SNP calling and de novo
genome assembly.
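
The base-calling idea described above (take the most probable nucleotide
at each cycle and concatenate the calls) can be illustrated with a toy
sketch. This is not Bustard's actual algorithm or interface: the input
layout, the function name, and the min_confidence parameter are all
hypothetical, and real base-calling also corrects for effects such as
phasing and dye cross-talk that are ignored here.

```python
def call_read(cycle_probs, min_confidence=0.0):
    """Concatenate the most probable base of each cycle into a read.

    cycle_probs is a list with one dict per synthesis cycle, mapping each
    base ("A", "C", "G", "T") to its estimated probability. Calls whose
    probability falls below min_confidence are reported as "N".
    """
    bases = []
    for probs in cycle_probs:
        base, prob = max(probs.items(), key=lambda item: item[1])
        bases.append(base if prob >= min_confidence else "N")
    return "".join(bases)


# Three cycles of made-up probabilities for one cluster.
cycles = [
    {"A": 0.90, "C": 0.05, "G": 0.03, "T": 0.02},
    {"A": 0.10, "C": 0.20, "G": 0.60, "T": 0.10},
    {"A": 0.25, "C": 0.40, "G": 0.20, "T": 0.15},
]
print(call_read(cycles))
print(call_read(cycles, min_confidence=0.5))
```

The second call shows how a confidence threshold turns an ambiguous
cycle into an "N" instead of a possibly wrong base.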




5. Examining Data Quality and Parsing Barcoded Data

Before we can use the ChIP-Seq data to answer biological questions, we
need to make sure the data are of sufficient quality. There are several
summary statistics to examine after a sequencing run: % Error
(multiplexing runs usually have a higher % error than nonbarcoded runs,
around 5%), % Phasing (< 1%), total reads (GAII can reach 12–14 million),
and cluster density (~100,000). Other statistics, used to verify the
alignment percentage of short sequences to the genome, are Total No
Match, % No Match, Total QC Fail, % QC Fail, R0 Multiple Match, R1
Multiple Match, R2 Multiple Match, Total Multiple Match, % Multiple
Match, U0 Unique Match, U1 Unique Match, U2 Unique Match, Total Unique
Match, and % Unique Match.

For multiplexing runs, users need to parse the Eland query file by
matching the first several nucleotides with the barcode sequences, remove
the barcode sequences from the sequencing reads, and rerun Eland
separately for each parsed data set. In our case, the barcodes are GTAT,
CATT, ACGT, and TGCT (Lefrancois et al., 2009). Users do not need to
check alignment statistics for the entire lane, because that alignment
uses sequence tags that still carry the barcodes at the 5' end. Instead,
users should check these statistics for the parsed data from which the
barcode sequences have been removed. A typical run of yeast multiplexing
sequencing with four barcodes has ~15% Total Multiple Match and ~60%
Total Unique Match.

Some sample Perl scripts to perform the barcode parsing and Eland
rerunning tasks can be found at
http://pantheon.yale.edu/~wz4/Homepage.html. Since the scripts call the
Eland program, they need to be run on the server where the GAP resides,
and modified according to user-specific directory structures. If
automatic barcode parsing and Eland alignment are desired, please consult
with IT support to integrate these functions into the GAP.
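
The parsing step itself is straightforward, and a minimal sketch is
given here in Python rather than Perl. It is not the script distributed
at the address above: the function name is hypothetical, reads are
assumed to be available as plain sequence strings (the layout of the
Eland query file is not reproduced), and only exact barcode matches are
accepted, in line with the requirement that the barcode be intact.

```python
from collections import defaultdict

BARCODES = ("GTAT", "CATT", "ACGT", "TGCT")


def split_by_barcode(reads, barcodes=BARCODES):
    """Group reads by their leading barcode and strip the barcode.

    Reads whose first bases do not exactly match a known barcode are kept
    unchanged under the key "unassigned".
    """
    length = len(barcodes[0])
    known = set(barcodes)
    bins = defaultdict(list)
    for read in reads:
        read = read.strip().upper()
        if not read:
            continue  # skip blank lines
        tag, insert = read[:length], read[length:]
        if tag in known:
            bins[tag].append(insert)
        else:
            bins["unassigned"].append(read)
    return dict(bins)


# Made-up reads: three carry a valid barcode, one has a sequencing error.
reads = ["GTATAAAACCCC", "CATTGGGG", "ACGAGGGG", "TGCTTTTT"]
for barcode, inserts in split_by_barcode(reads).items():
    print(barcode, inserts)
```

Each per-barcode list would then be written out and aligned separately,
as described above.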



6. Visualization in Genome Browser 

After aligning short sequencing tags onto a reference genome, we can
load the data into a genome browser to directly visualize how the short
tags are distributed across the genome, and whether there is any
enrichment near the regions of interest. Many Web-based genome browsers
are available to the yeast community, including GBrowse and the UCSC
Genome Browser. One can upload data to the server in a format recognized
by the genome browser, and visualize the signal track along with other
annotations on the host webpage of that genome browser. Here we provide
a step-by-step guide to visualizing ChIP-Seq data on a local machine
using the Integrated Genome Browser (IGB) developed by Affymetrix. First,
download and launch IGB following the instructions at:
http://www.affymetrix.com/partners_programs/programs/developer/tools/download_igb.affx.

To load S. cerevisiae annotations, click File → Access DAS/1 Servers;
in the pop-up window named "DAS/1 Feature Loader," choose "UCSC" in the
"DAS Server" pull-down menu, "sacCer1" in the "Data Source" pull-down
menu, and "1" (or any other chromosome you want to see) in the "Sequence"
pull-down menu, and check any annotations of interest in the "Available
Annotations" window.

To load ChIP-Seq data into IGB, one needs to transform the Eland
results into a format that can be recognized by IGB. A sample Perl script
for this purpose can be found at
http://pantheon.yale.edu/~wz4/Homepage.html. To run this script, simply
type the following command in a command line shell:

perl create_sgr_file.pl "eland_result_folder" 

The "eland_result_folder" is the folder containing the Eland results
files split by chromosome, with names "eland_results_chr*.txt."

This script transforms the Eland results files into .sgr format, which is
compatible with Affymetrix's IGB. The format for each line of an sgr
file is:

Chromosome Start_Position Score 

where "Score" is the number of overlapping ChIP fragments from 
the current Start_Position to 1 bp upstream of the next Start_Position. 
Before counting overlapping ChIP fragments for each genomic position, 
create_sgr_file.pl also extends sequencing tags in the 3' direction to 200 bp 
since Illumina sequencing tags only represent one end of ChIP fragments, 
and the average ChIP fragment size in the sequencing library is 200 bp. 
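
The tag extension and counting just described can be sketched as a
sweep over fragment boundaries. This is an independent illustration in
Python, not a port of create_sgr_file.pl: the function names are
hypothetical, tags are assumed to be given as (leftmost aligned position,
strand) pairs, and the 32-bp read length and tab-separated output are
assumptions made for the example.

```python
from collections import Counter


def tags_to_sgr(chrom, tags, frag_len=200, read_len=32):
    """Convert aligned tags into (chromosome, start, score) rows.

    Each tag is extended to frag_len in its 3' direction. The score of a
    row is the number of overlapping fragments from that start position
    up to 1 bp upstream of the next row's start position.
    """
    delta = Counter()
    for start, strand in tags:
        if strand == "+":
            frag_start, frag_end = start, start + frag_len - 1
        elif strand == "-":
            # The 5' end of a reverse-strand tag is its rightmost base.
            frag_end = start + read_len - 1
            frag_start = max(1, frag_end - frag_len + 1)
        else:
            raise ValueError(f"unknown strand: {strand!r}")
        delta[frag_start] += 1    # coverage rises where a fragment begins
        delta[frag_end + 1] -= 1  # and drops just after it ends
    rows, depth = [], 0
    for pos in sorted(delta):
        if delta[pos] == 0:
            continue  # a start and an end cancel out: no change in score
        depth += delta[pos]
        rows.append((chrom, pos, depth))
    return rows


def format_sgr(rows):
    """Render rows as tab-separated lines (separator is an assumption)."""
    return "\n".join(f"{c}\t{p}\t{s}" for c, p, s in rows)


rows = tags_to_sgr("chr1", [(100, "+"), (369, "-")])
print(format_sgr(rows))
```

Only positions where the score changes are emitted, which keeps the file
small compared with one line per base.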

6.1. Low-level analysis 

The Genome Browser can help scientists visualize and roughly determine
sequence-tag enriched regions. To precisely identify TFBS across the
genome, we need a more rigorous peak-scoring algorithm. Since the
emergence of ChIP-Seq technology, a number of peak-scoring algorithms
have been developed for mammalian genomes, and most of them can also be
used to analyze yeast ChIP-Seq data. Here we briefly describe the basic
flowchart of peak-scoring algorithms, and summarize the major features
of several popular peak-scoring algorithms in Table 4.2. A peak-scoring
algorithm usually compares sequencing data from a ChIP experiment with a
simulated background, or with sequencing data from a control experiment
(nontagged strain, IgG ChIP, or input DNA). To determine regions with
enriched sequence-tag distribution, the scoring algorithm normalizes the
ChIP-Seq data and control sequencing data to the same scale, and then
uses an appropriate statistical test (e.g., Binomial, Poisson, Normal) to
compare the distributions of sequence tags in the two data sets. The
significance level is adjusted to control the false discovery rate.
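
As an illustration of this flowchart, the sketch below scores fixed
windows with a one-sided binomial test and adjusts the p-values with the
Benjamini-Hochberg procedure. It is a generic example, not the method of
any algorithm listed in Table 4.2: the function names and the example
counts are hypothetical, normalization is reduced to the ratio of total
read counts, and the exact binomial sum is only practical for the modest
per-window counts typical of yeast data.

```python
from math import comb


def binomial_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed by direct summation."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


def window_pvalues(chip_counts, control_counts, chip_total, control_total):
    """One-sided p-value per window for enrichment in ChIP over control.

    Under the null hypothesis, a tag in a window comes from the ChIP
    sample with probability chip_total / (chip_total + control_total),
    which puts the two samples on the same scale.
    """
    p0 = chip_total / (chip_total + control_total)
    return [binomial_sf(c, c + b, p0)
            for c, b in zip(chip_counts, control_counts)]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up tag counts in two windows, with equally deep samples.
pvals = window_pvalues([30, 5], [5, 5], chip_total=1_000_000,
                       control_total=1_000_000)
print(pvals)
print(benjamini_hochberg(pvals))
```

Windows whose adjusted p-value falls below the chosen false discovery
rate would be reported as candidate binding regions.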

Because the Illumina sequencing platform only reads 30–32 bp from one
end of the ChIP DNA fragment, the most enriched regions (peak centers) of
sequencing tags may not overlap exactly with the most enriched regions
(peak centers) of ChIP DNA fragments, the latter corresponding to the
potential binding sites meaningful to biologists. Several methods have
been proposed in published peak-scoring algorithms to convert sequencing
tag positions to peak center positions. The most straightforward way is
to extend the sequencing tags to the length of the original ChIP DNA
fragment, which is about 200 bp due to size selection on agarose gel
during sequencing library construction (Rozowsky et al., 2009; Xu et al.,
2008). More sophisticated methods include estimating the length of the
original ChIP DNA fragments with a triangle- or bell-shaped distribution
centered at ~200 bp (Fejes et al., 2008), or separating sequencing reads
aligned onto the Watson and Crick strands and using the distance between
the peak center on the Watson strand and the peak center on the Crick
strand to estimate the length of the original ChIP DNA fragments (Ji et
al., 2008; Jothi et al., 2008; Zhang et al., 2008).
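
The strand-based idea can be illustrated with a deliberately simple
substitute for the published methods: instead of locating peak centers,
the sketch below scans candidate shifts and keeps the one at which the
largest number of Watson-strand tag starts line up with Crick-strand tag
ends. The function name, the max_shift parameter, and the coordinates are
hypothetical.

```python
from collections import Counter


def estimate_fragment_length(plus_starts, minus_ends, max_shift=400):
    """Shift (in bp) that best aligns Watson tags with Crick tags.

    plus_starts holds the 5' positions of Watson-strand tags, minus_ends
    the 5' positions of Crick-strand tags. The best shift approximates
    the average ChIP fragment length.
    """
    minus = Counter(minus_ends)
    best_shift, best_score = 0, -1
    for shift in range(max_shift + 1):
        score = sum(minus[pos + shift] for pos in plus_starts)
        if score > best_score:  # keep the smallest shift on ties
            best_shift, best_score = shift, score
    return best_shift


# Made-up tags from three fragments of 200 bp plus one unrelated tag.
plus_starts = [100, 500, 900]
minus_ends = [300, 700, 1100, 1234]
print(estimate_fragment_length(plus_starts, minus_ends))
```

On real data, tags would first be restricted to strongly enriched
regions so that background tags do not dominate the score.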

The biggest distinction among existing peak-scoring algorithms is the
method of extracting background from ChIP-Seq data. Because of the high
cost of the ChIP-Seq procedure, earlier peak-scoring algorithms often
considered one-sample analysis, which compares ChIP data with a null
background generated from random permutation or estimated from a Poisson
model. A one-parameter Poisson model (Feng et al., 2008; Marson et al.,
2008; Robertson et al., 2007) has been widely used in these peak-scoring
algorithms. Another popular method to estimate background is Monte Carlo
sampling (Bhinge et al., 2007; Chen et al., 2008; Fejes et al., 2008;
Johnson et al., 2007; Mikkelsen et al., 2007; Robertson et al., 2007;
Zhang et al., 2008). Later studies found that a Poisson model with a fixed
λ does not adequately describe the nonrandom fluctuations observed in the
input control. To alleviate this problem, CisGenome (Ji et al., 2008) used
a negative binomial instead of a Poisson distribution to model the
background, and MACS (Zhang et al., 2008) used dynamic Poisson parameters.
Both studies recognized that the random sampling process had different
sampling rates at different positions in the genome, and tried to capture
the changing parameters of the underlying Poisson model. Nonetheless, it
has become clear that two-sample analysis is superior because, in certain
genomic regions, the sequencing tag distribution in an input control
experiment shows nonrandom enrichment. Sometimes the same enrichment
pattern is also observed in the ChIP experiment (Nix et al., 2008;
Rozowsky et al., 2009; Zhang et al., 2008). Such enrichment is not likely
to be caused by TFBS; instead it probably represents fluctuations due to
systematic biases. There are many sources of systematic bias. Known
sources include technical factors, such as the method of DNA
fragmentation, biased amplification in PCR, and errors in the sequencing
and/or alignment processes; biological factors, such as the degree of
genome repetitiveness and open chromatin structure; and statistical
factors, such as the dependency among observations from neighboring
positions on a chromosome. In both one- and two-sample analyses, it is
assumed that the number of sequencing tags observed in a small window of
the genome comes from a random sampling process. Binomial or Poisson
models are often used in two-sample analyses to compare the number of
reads in windows of two samples (Feng et al., 2008; Ji et al., 2008;
Jothi et al., 2008; Nix et al., 2008; Rozowsky et al., 2009; Valouev
et al., 2008; Zhang et al., 2008).
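
The difference between a fixed and a position-dependent Poisson rate can
be illustrated with a minimal Python sketch. It is not the implementation
used by any of the cited tools; the function names, the windowed counts,
and the simple rule of taking the larger of the genome-wide rate and the
scaled input count are all assumptions made for illustration.

```python
import math


def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)      # P(X = 0)
    cdf = term
    for i in range(1, k):      # accumulate P(X = 1) ... P(X = k - 1)
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)


def window_pvalues(chip_counts, input_counts, scale=1.0):
    """Score each window with a fixed and with a local Poisson rate.

    chip_counts / input_counts: tag counts per window (hypothetical data).
    scale: assumed factor that normalizes input depth to ChIP depth.
    Returns a list of (p_fixed, p_local) tuples, one per window.
    """
    genome_lam = sum(chip_counts) / len(chip_counts)
    results = []
    for chip, ctrl in zip(chip_counts, input_counts):
        # Assumed rule: never let the local rate drop below the genome-wide rate.
        local_lam = max(genome_lam, ctrl * scale)
        results.append((poisson_sf(chip, genome_lam), poisson_sf(chip, local_lam)))
    return results


if __name__ == "__main__":
    chip = [2, 2, 20, 2]
    ctrl = [1, 1, 18, 1]       # the input control is also enriched in window 3
    for p_fixed, p_local in window_pvalues(chip, ctrl):
        print(f"fixed: {p_fixed:.3g}  local: {p_local:.3g}")
```

In this toy example the third window looks highly significant under the
fixed rate but not once the enriched input control raises the local rate,
which is the kind of systematic bias that two-sample analysis is meant to
absorb.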



6.2. High-level analysis 

Once TFBS are identified from the ChIP-Seq data, one can carry out
higher-level analyses to answer biological questions, such as motif
analysis, association of TFBS with neighboring genes, and comparison with
ChIP-chip data. One can also study the positions of TFBS relative to
genome annotation features, such as intragenic versus intergenic binding
and binding in 5' or 3' untranslated regions. These analyses are all
implemented in an integrative open-source software package, CisGenome
(Ji et al., 2008). It has many functions ranging from low-level to
high-level analysis, and its graphical interface under Windows is
user-friendly for bench scientists. It can be downloaded from the
following web site: http://www.biostat.jhsph.edu/~hji/cisgenome/.
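
As a rough illustration of what associating TFBS with genome annotation
involves, the following Python sketch classifies a binding site as
intragenic or intergenic and reports the nearest gene. It is not
CisGenome code; the function name, the gene tuples, and the coordinates
are hypothetical, and strand and UTR information are ignored.

```python
def classify_site(pos, genes):
    """Classify a binding-site position against genes on one chromosome.

    genes: list of (name, start, end) tuples with start <= end
           (hypothetical annotation format).
    Returns ("intragenic", gene) if pos falls inside a gene, otherwise
    ("intergenic", nearest_gene) based on distance to the closest gene end.
    """
    for name, start, end in genes:
        if start <= pos <= end:
            return ("intragenic", name)
    nearest = min(genes, key=lambda g: min(abs(pos - g[1]), abs(pos - g[2])))
    return ("intergenic", nearest[0])


if __name__ == "__main__":
    annotation = [("GENE_A", 100, 500), ("GENE_B", 900, 1500)]
    for site in (300, 600, 800):
        print(site, classify_site(site, annotation))
```

A real analysis would also take strand into account, so that a site can
be assigned to the gene whose promoter it lies in rather than simply to
the closest gene boundary.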



6.3. Troubleshooting 

If the sequencing run yields many reads, but the percentage of matched 
reads after alignment is low, a possible explanation for this phenomenon is 
sample overloading. If too much DNA is loaded onto the flowcell, there 
will not be enough separation between neighboring clusters and base- 
calling error rates will be high. By checking the summary statistics "cluster 
density" and "% Error," one can find deviations from optimal values, and 
adjust sample concentration accordingly. If summary statistics for sequenc- 
ing runs are adequate, one still needs to search for technical problems in 
ChIP and library construction procedures.

A common problem in IGB visualization is that ChIP-Seq data are shown
in a separate window from other annotations. The reason is that Eland
result files sometimes use a different chromosome naming system (e.g.,
chr01, chr02, . . ., chrmt) from that in IGB (e.g., chr1, chr2, . . .,
chrM). In this case, one needs to rename all the chromosomes in the IGB
system to visualize ChIP-Seq data along with other annotations in the
same window. If there is not enough memory to load data into IGB, one can
load data for one chromosome at a time.
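
The two naming systems mentioned above differ only in zero padding and in
the mitochondrial label, so the renaming can be scripted. The Python
sketch below shows one way to convert names in either direction; the
function names are hypothetical, and which files to rewrite depends on
the local IGB setup.

```python
import re


def eland_to_igb(name):
    """Convert zero-padded names (chr01, chrmt) to unpadded ones (chr1, chrM)."""
    if name.lower() == "chrmt":
        return "chrM"
    match = re.fullmatch(r"chr0*(\d+)", name)
    return f"chr{match.group(1)}" if match else name


def igb_to_eland(name):
    """Convert unpadded names (chr1, chrM) to zero-padded ones (chr01, chrmt)."""
    if name == "chrM":
        return "chrmt"
    match = re.fullmatch(r"chr(\d+)", name)
    return f"chr{int(match.group(1)):02d}" if match else name


if __name__ == "__main__":
    for chrom in ("chr01", "chr16", "chrmt"):
        print(chrom, "->", eland_to_igb(chrom))
```

Names that match neither pattern are returned unchanged, so unexpected
labels are left for manual inspection rather than silently altered.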




7. Conclusion and Future Directions 

ChIP-Seq has emerged as a highly sensitive and cost-effective method
for genome-wide mapping of TFBS at high resolution. Barcoded ChIP-Seq
enables multiplex short-read sequencing and offers higher throughput
and lower cost per sample. An ongoing debate in the ChIP-Seq field
concerns the nature of the control DNA used for scoring ChIP-Seq
experiments. Although most groups use input DNA, it is still unsettled
whether input DNA, normal IgG DNA or, in the case of yeast, ChIP DNA from
an untagged strain is the preferable reference sample for ChIP-Seq. As
the read length, read quality, and read quantity of high-throughput DNA
sequencing technologies improve, it will be possible to obtain more
reads with longer sequence lengths. For yeast ChIP-Seq, this will allow
an increased multiplex capability. The computational challenges of data
handling and long-term data storage require high-performance computing
clusters and are much more complex than those of ChIP-chip analyses,
which could be performed by most users. The protocols developed for
yeast, such as barcoded ChIP-Seq, can be readily extended to lower
eukaryotes and eventually to higher eukaryotes with the advent of higher
capacity DNA sequencers.

Abstract

The packaging of eukaryotic genomes into chromatin has wide-ranging influ- 
ences on all DNA-templated processes, from DNA repair to transcriptional 
regulation. The repeating subunit of chromatin is the nucleosome, which com- 
prises 147 bp of DNA wrapped around an octamer of proteins. Positioning of 
nucleosomes relative to underlying DNA is a key factor in the regulation of 
gene transcription by chromatin, as DNA sequences between nucleosomes 
are more accessible to regulatory factors than are DNA sequences within 
nucleosomes. Here, I describe protocols for mapping nucleosome positions 
across the yeast genome. 




1. Introduction 

The yeast genome, like that of all eukaryotes, is packaged into a
nucleoprotein complex known as chromatin. The repeating subunit of
chromatin is the nucleosome, which consists of 147 base pairs (bp) of
DNA wrapped around an octamer of basic histone proteins. The positioning
of nucleosomes on underlying DNA has consequences for gene regulation:
nucleosomal occlusion of protein-binding sites is generally thought to
inhibit protein binding. Nucleosomal positioning has also recently been
shown to affect sequence evolution, with point mutations fixing at lower
rates in linker DNA relative to nucleosomal DNA (Sasaki et al., 2009;
Washietl et al., 2008). Though there are clearly DNA sequences that favor
and disfavor nucleosomal incorporation (Ioshikhes et al., 1996, 2006;
Iyer and Struhl, 1995; Kaplan et al., 2008; Segal et al., 2006; Sekinger
et al., 2005; Yuan and Liu, 2008), essentially any DNA sequence can wrap
around the histone octamer to form a nucleosome. Furthermore, protein
complexes such as Isw2 regulate gene expression by moving nucleosomes
away from their thermodynamically favored positions (Whitehouse and
Tsukiyama, 2006; Whitehouse et al., 2007). Nucleosome positioning affects
signal processing during gene regulation: nucleosomal occlusion of key
promoter sequences has been shown to change the regulatory logic of the
β-interferon promoter from an OR to an AND gate for three regulatory
inputs (Lomvardas and Thanos, 2002), and to separate signaling threshold
from dynamic range at the PHO5 promoter in yeast (Lam et al., 2008).
Thus, understanding where nucleosomes are located in the genome has
implications for a wide range of interesting and important aspects of genome
biology.

A number of features distinguish nucleosomal DNA from the linker 
DNA that intervenes between adjacent nucleosomes. Most notably, 
nucleosomal DNA is relatively protected from cleavage by a variety of 
nucleases. The most common nuclease used for assaying nucleosome
positions is micrococcal nuclease, which has little (but measurable)
sequence preference on naked DNA, but has a dramatic preference for linker
DNA over nucleosomal DNA. Thus, identification of DNA protected 
from micrococcal nuclease digestion has been a mainstay for mapping of 
nucleosome positions in vivo. 

In the past few years, protocols have been developed enabling the
mapping of nucleosome positions across the entire genome (Albert et al.,
2007; Field et al., 2008; Mavrich et al., 2008a,b; Schones et al., 2008;
Weiner et al., 2009; Yuan et al., 2005). I will not describe mapping methods,
such as chromatin immunoprecipitation with an antihistone antibody
(Bernstein et al., 2004), or formaldehyde-assisted isolation of regulatory
elements (Lee et al., 2004), that do not have single-nucleosome resolution.
The current state-of-the-art protocols for nucleosome mapping all rely on
protection from micrococcal nuclease. Isolated DNA is characterized by
tiling microarray or by "deep" sequencing, providing whole-genome,
high-resolution data on the packaging state of the yeast genome. The
protocols described were developed and validated in Saccharomyces cerevisiae,
but we have used them successfully with over 10 different Ascomycete species
(not shown), so they can be applied broadly.




2. Isolation of Mononucleosomal DNA

• Solutions (make beforehand): 

Buffer Z:

1 M sorbitol
50 mM Tris-HCl, pH 7.4

NP buffer:

1 M sorbitol
50 mM NaCl
10 mM Tris, pH 7.4
5 mM MgCl2
1 mM CaCl2

Glycine:

2.5 M glycine

Spermidine:

250 mM in water; ~500 µl aliquots can be stored at -20 °C, and can be
repeatedly freeze/thawed.

MNase:

We always obtain MNase from Worthington. MNase is resuspended at
20 units/µl in 10 mM Tris, pH 7.4, and stored in ~50 µl aliquots
at -80 °C. Aliquots can be freeze/thawed several (at least 2-3) times;
store at -20 °C after first use of an aliquot.

Proteinase K:

20 µg/µl; ~500 µl aliquots can be stored at -20 °C, and repeatedly
freeze/thawed.

• 6x Orange G loading buffer:

60% glycerol
1.5 mg/ml Orange G (Sigma O-1625)

• Day 0:

Inoculate a 2-5 ml culture of YPD (or any other medium) with the cells of
interest, typically S288C derivatives BY4741 (MATa) or BY4742
(MATalpha).

• Day 1: 

Late PM: Count # cells/ml, calculate inoculation (BY4741 has a generation
time of ~90 min in YPD) to have proper OD the following AM.
Inoculate 444 ml YPD in a 2-liter flask, grow shaking at 200 rpm at 28 °C.
We would typically inoculate ~7 µl of saturated BY4741 into this
444 ml culture at 7 p.m. for a mid-log OD the next day around 11 a.m.
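
The inoculation calculation can be sketched in a few lines of Python.
This is only a back-of-the-envelope model: it assumes strictly
exponential growth with no lag phase, an OD that is proportional to cell
density, and a user-supplied OD for the saturated starter culture. The
function name and the example numbers are hypothetical, and in practice
the inoculum should be calibrated empirically for each strain and
incubator.

```python
def inoculum_volume_ul(target_od, culture_ml, saturated_od, hours, generation_min):
    """Estimate the volume (in µl) of saturated starter culture to inoculate.

    Assumes pure exponential growth (no lag phase) for the whole interval.
    """
    doublings = hours * 60.0 / generation_min
    start_od = target_od / (2.0 ** doublings)      # OD needed at time zero
    return start_od * culture_ml * 1000.0 / saturated_od


if __name__ == "__main__":
    # Hypothetical example: 100 ml culture, 3 h of growth, 90 min doubling time.
    vol = inoculum_volume_ul(target_od=0.8, culture_ml=100,
                             saturated_od=8.0, hours=3.0, generation_min=90)
    print(f"inoculate {vol:.0f} µl")
```

Because the result changes twofold for every generation time of error in
the growth interval, small differences in temperature or lag translate
into large differences in the OD reached the next morning.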

We prefer to use shaking water bath incubators when possible, since water 
bath incubation is less susceptible to temperature variability than incuba- 
tion in air incubators. We have found an approximately twofold differ- 
ence in doubling time between different clamp positions in a typical 
shaking incubator using blown air for heating (cultures near the fan 
generally grow faster, and are likely warmer), while no such difference 
is seen between different positions in a shaking water bath. 

• Day 2: 

Check to make sure all water baths are at the correct temperature and have 
enough water in them. 

Grow yeast culture to desired OD600 (usually ~0.8).

Add 24 ml formaldehyde (37% as purchased) to a final concentration of
~2%. We have found little difference between 1% and 2% formaldehyde in
nucleosome positioning results, but we find better yields from
experiments with 2%. This may be related to the spheroplasting step, but
whatever the reason, we find optimal nucleosomal yields with 2%. Also, we
always use formaldehyde within a month of purchase.

Incubate 30 min, 28 °C, shaking. In general, whatever the growth
condition was before fixation, fix cells in those conditions.

Pour culture into a 1-liter centrifuge bottle containing 24 ml 2.5 M
glycine (to a final conc. of ~125 mM) to react with remaining
formaldehyde. If going straight to spheroplasting (i.e., no waiting for
other cultures) the glycine can be omitted.

Spin (4000 rpm, SLA-3000 rotor, 5 min). Pour off supernatant and
resuspend cell pellet in ~50 ml ddH2O in a 50 ml conical. Pellet cells
again (tabletop Sorvall, 3700 rpm, 2 min).

Cells can now be left on ice as long as necessary (we have tested up to 3 h, 
with no change in positioning data) for other cells to "catch up," etc. 

If necessary make Buffer Z. 

Make zymolyase solution (10 mg/ml in Buffer Z). Zymolyase is not
particularly soluble at this concentration, so shake well right before
adding to yeast to evenly resuspend. We obtain zymolyase from Seikagaku,
distributed by Cape Cod Associates in the United States. Zymolyase
solution lasts roughly a week, so should be made up fresh more or less
every experiment.

Resuspend each cell pellet (from 450 ml of cells) in 39 ml Buffer Z. Add
28 µl β-mercaptoethanol (14.3 M, final conc. 10 mM) to each conical.
Cap conical, and vortex cells to resuspend; it is important to get the
cell pellet fully resuspended. Add 1 ml Zymolyase (10 mg/ml). Incubate at
28 °C, shaking, for 30-35 min.

Note that the spheroplasting step is somewhat dependent on growth
conditions, and on the strain/species. For example, S. cerevisiae in
stationary phase are harder to spheroplast than mid-log yeast. Success in
this step can be assessed directly by microscopy (spheroplasts will lyse
in water, intact yeast will not), but will also be readily apparent when
nucleosomal DNA is isolated: unspheroplasted cells will result in a band
of undigested large fragments of genomic DNA. If a small fraction of
cells are not spheroplasted, this does not seem to be a problem beyond
lost yield, as mononucleosomal DNA is gel-purified away from this genomic
contamination in any case. But we prefer to have > 90% spheroplasted
cells for experiments we care about (i.e., nontroubleshooting).

Meanwhile, add sensitive components to NP buffer: to 5 ml NP buffer
add 10 µl 250 mM spermidine (final conc. 500 µM), 3.5 µl of
β-mercaptoethanol diluted 1:10 in water (final conc. 1 mM), and
37.5 µl 10% NP-40 (final conc. 0.075%).
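
The final concentrations quoted for the NP buffer additions can be
checked with simple dilution arithmetic. The Python sketch below is an
illustrative helper, not part of the protocol; the function name is
hypothetical, and the 1.43 M value for the 1:10 β-mercaptoethanol
dilution is derived from the 14.3 M stock stated earlier in the protocol.

```python
def final_concentrations(base_ul, additions):
    """Compute final concentrations after adding small volumes of stocks.

    base_ul: starting buffer volume in µl.
    additions: dict mapping name -> (stock_concentration, added_volume_ul).
    Concentrations are returned in the same unit as each stock.
    """
    total_ul = base_ul + sum(vol for _, vol in additions.values())
    return {name: conc * vol / total_ul for name, (conc, vol) in additions.items()}


if __name__ == "__main__":
    np_additions = {
        "spermidine (mM)": (250.0, 10.0),
        "beta-mercaptoethanol (mM)": (1430.0, 3.5),   # 14.3 M stock diluted 1:10
        "NP-40 (%)": (10.0, 37.5),
    }
    for name, conc in final_concentrations(5000.0, np_additions).items():
        print(f"{name}: {conc:.3g}")
```

The computed values come out marginally below the nominal ones because
the helper includes the added volumes in the total; the difference is
about 1% and is negligible for this purpose.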

During spheroplast spin (below), aliquot micrococcal nuclease (MNase) to
three or four Eppendorf tubes. Put a dot of MNase on the side of the tube
(halfway down tube).

Suggested concentration range (1 flask of mid-log cells, 444 total ml,
OD600 > 0.8): 3, 6, and 10 µl of MNase. If doing four titration steps,
1, 1.8, 3.5, and 6 µl would be our starting MNase levels (since
spheroplasts will be more dilute if dividing into four aliquots).
Information from the first titration will guide further titrations.

After zymolyase digestion, pellet cells (4900 rpm in a Sorvall RC3
centrifuge, or ~7000 × g, 10 min, 4 °C), or if absolutely necessary on a
tabletop Sorvall, 3700 rpm, 10 min. Aspirate supernatant; be very
careful, as spheroplast pellets are fluffy and will suck into the
aspirator. We generally try to get almost all the supernatant while
losing a small amount of the spheroplasts (a small amount of supernatant
will not affect the MNase reaction). Resuspend cells in ~2 ml (600 µl
for each of three titration steps + 200 µl) NP buffer by pipetting up
and down several times. Make sure pellet is resuspended.

Add 600 µl of cells to each Eppendorf already carrying micrococcal
nuclease. Add the cells to the tubes directly over the spot of nuclease,
and try to be even and quick between tubes. I generally start with the
most dilute sample and use the same P1000 tip to add to each successive
tube to minimize tip changing times.

As soon as all the cells have been added, start the timer, close the tubes,
invert them once to mix, and incubate at 37 °C (water bath) for 20 min.

About 5 min before the reaction ends, make 5x STOP buffer: to make
1 ml, add 400 µl water, 500 µl 10% SDS, and 100 µl 0.5 M EDTA. STOP
buffer needs to be made fresh every time since it precipitates during
storage.

Stop the reaction after 20 min by adding 150 µl of 5x STOP buffer to
each digestion.

Add 6 µl proteinase K (20 µg/µl). Invert tubes to mix. Incubate at 65 °C
overnight to digest proteins and to reverse formaldehyde cross-links.

• Day 3: 

Remove tubes from 65 °C. 

Phenol:chloroform (P:C) extract once. We do this as below to maximize
DNA yield, but any P:C extraction should work here.

Add ~1 volume (800 µl) phenol:chloroform:IAA, mix, and centrifuge for
5 min in Eppendorf "heavy" phase lock tubes.

Phase lock tubes: spin at 16,000 × g for 30 s to bring down gel. Add
phenol/aqueous mix, invert repeatedly to form homogeneous solution
(do not vortex). Spin at 16,000 × g for 5 min. Use P200 pipette to recover
DNA-containing aqueous layer above phase lock gel.

Add 1/10 volume (75 µl) 3 M sodium acetate, pH 5.5.

Add 1 volume isopropanol (fill to top) to precipitate DNA, freeze at
-20 °C for 30 min, spin at max speed in microcentrifuge for 10 min.
Aspirate supernatant (pellet may be loose), wash with 500 µl ice-cold
70% ethanol, spin 5 more min. Aspirate, being especially careful about
the pellet in this step. Air dry.

Resuspend pellet in 60 µl NEB buffer 2 (dilute commercial stock 1:10!),
incubate pellet at 37 °C for 1 h (or leave pellet at 4 °C overnight).

Add 1 µl DNase-free RNase (10 µg/µl stock, Roche), incubate at 37 °C for
1 h.

Run products on 1.8% agarose gel to find which part of titration worked the
best. Loading ~1 µl is usually sufficient. Use 6x Orange G loading
buffer for all nucleosomal DNA, as the dye front in standard DNA
loading buffer runs right on top of the mononucleosomal band in this
percentage gel.

Take gel images (see Fig. 5.1). Beware of the bright fuzzy band that runs
with the Orange dye front; it is not mononucleosomes!

Take best titration (~80% mononucleosomal DNA with ~20% dinucleosome
and a ghost of trinucleosome; see below) and load entire fraction
onto approximately three lanes of a fresh 1.8% gel. Using a brand new
razor blade, gel purify mononucleosomal band away from dinucleosomes,
and away from the "puff" of degraded DNA/RNA (see Fig. 5.1).
As always with gel purification, trim band of excess agarose, but also
attempt to minimize UV exposure of DNA.

If there is a lot of degraded DNA (running < 100 bp), first purify the DNA
with a Qiagen cleanup column to remove the small pieces. Otherwise,
the gel may not run correctly (bands will not separate). Resuspend entire
samples (combined titrations) into 45 µl of elution buffer, which will
fit into two lanes.

Figure 5.1 Typical MNase titration series. Agarose gel analysis of digestion products
isolated from yeast treated with varying amounts (triangle, top) of micrococcal nuclease
(MNase). Left lane is 100 bp marker. Features of interest are indicated with arrows.



Use BioRad Freeze-N-Squeeze tubes to purify DNA from gel. We use this
method as typical gel purification kits get abysmal yields from DNA in
this size range. Mince band, freeze for 5 min at -20 °C, spin at
13,000 × g for 3 min. We typically add another 50-100 µl of TE on top of
the squished gel in the tube and spin again to increase yield, but this
is not necessary.

To clean up DNA, add 1 volume phenol:chloroform:IAA, vortex, remove
the aqueous phase to a new tube. Repeat (P:C extract twice).

Add 1 µl of glycogen (20 mg/ml, from Mytilus edulis, Sigma G1767) to assist
in precipitation. Add 1/10 volume of 3 M Na acetate (pH 5.5), 2.5-3
volumes 100% EtOH. Incubate at -20 °C for 20-30 min, spin at max
speed for 10 min, gently aspirate off supernatant, being careful not to
disturb pellet.

Wash with 500 µl of ice-cold 70% EtOH to remove salt.

Spin max speed for 5-10 min. Aspirate supernatant, and air dry pellet.

Resuspend pellet in 25 µl of H2O. Pellet may not always completely
dissolve, possibly due to glycogen. In that case, spin briefly and take
supernatant above slurry.

Run 1 µl of the prepared mononucleosomal DNA on a 1.8% gel after cleanup
to confirm the DNA is there, clean, etc.

This DNA can now be labeled for microarray analysis, or ligated to 
linkers to generate a deep sequencing library. 






Oliver J. Rando 




3. Variation in Titration Level Used for 
Nucleosome Purification 



For years we have used nucleosomal DNA from a titration step with
~10-20% dinucleosome. The rationale here is that overdigestion of
mononucleosomal DNA leads to a slow, progressive trimming of nucleosomes,
so the digestion level of a titration step with only mononucleosomal DNA
is hard to judge, and hard to reproduce. Using deep sequencing, we have
more recently characterized the nucleosomal populations from over- and
underdigested chromatin as well (see Fig. 5.2; Weiner et al., 2009).

Nucleosome maps from the three titration steps broadly agree, with
differences between them largely, though not entirely, being changes in
occupancy rather than positions of nucleosomes. For example, the first and
last nucleosomes in coding regions are often relatively highly occupied in
the underdigested map, presumably thanks to increased accessibility to
MNase of exposed DNA adjacent to nucleosome-depleted promoters.
Moreover, so-called "nucleosome-free regions," especially the longer
ones characteristic of very highly expressed genes, tend to be partially
occupied in underdigested chromatin, suggesting that these regions are in
fact occupied by loosely bound histones that are readily digested away.

Figure 5.2 Effect of digestion level on nucleosome maps. A. Gel, as in Figure 5.1,
showing an MNase titration from which 3 mononucleosome bands have been excised
(indicated with boxes), corresponding to under-, well, and over-digested chromatin,
from left to right. B. Chromatin maps differ depending on digestion level. Deep
sequencing data for the three nucleosome preps in A were normalized, and data for
all genes aligned by transcription start site (TSS) are averaged for each dataset.

What does this mean for mapping studies? Most importantly, when
comparing two samples it is crucial to record the gels from which the
nucleosomal DNA was isolated. Artifacts are a potential pitfall in data
analysis: an accidental difference in digestion levels between wild-type
and some mutant would give the false impression that promoter nucleosome
occupancy was affected by the mutation of interest. Another way to
determine, after sequencing, whether chromatin has been digested to
similar extents is to average nucleosomal data for all genes aligned by
the transcriptional start site (Fig. 5.2B), since in this view under-
and overdigested chromatin have characteristic patterns that differ
somewhat from our preferred titration level.
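
The TSS-aligned averaging used for this comparison can be sketched as
follows. This Python example is a minimal illustration under assumed data
structures (a per-base occupancy list for each chromosome and a list of
TSS coordinates with strand); the function name and the toy data are
hypothetical, and normalization between datasets is not shown.

```python
def tss_average(occupancy, tss_list, upstream, downstream):
    """Average occupancy over all genes in a window around the TSS.

    occupancy: dict mapping chromosome -> list of per-base occupancy values.
    tss_list: list of (chromosome, tss_position, strand) with strand "+" or "-".
    Returns a list of length upstream + downstream + 1; index `upstream`
    corresponds to the TSS itself. Minus-strand genes are flipped so that
    upstream always points away from the gene body.
    """
    width = upstream + downstream + 1
    totals = [0.0] * width
    counts = [0] * width
    for chrom, tss, strand in tss_list:
        signal = occupancy[chrom]
        for offset in range(-upstream, downstream + 1):
            pos = tss + offset if strand == "+" else tss - offset
            if 0 <= pos < len(signal):          # skip positions off the chromosome end
                idx = offset + upstream
                totals[idx] += signal[pos]
                counts[idx] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


if __name__ == "__main__":
    toy = {"chrA": [0, 1, 2, 3, 4, 5, 6]}
    print(tss_average(toy, [("chrA", 3, "+")], upstream=2, downstream=2))
```

Computing this profile for each sample and overlaying the curves gives a
quick visual check of whether two preparations were digested to a
comparable extent before occupancy differences are interpreted.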

Finally, it is worth noting that this variation in occupancy at different 
digestion levels could potentially be utilized as a structural probe for chro- 
matin in vivo, and some experimenters may find mapping over a range of 
digestion levels a valuable tool in their studies. 




4. Labeling of Mononucleosomal DNA 
for Tiling Microarray Analysis 

Note that this protocol describes labeling and hybridization on
"homemade" tiling microarrays. For commercial microarrays, labeling
and hybridization should be carried out according to the manufacturer's
protocols. This is particularly true for Affymetrix microarrays. On the other
hand, we have used this protocol quite successfully with Nimblegen and 
Agilent microarrays, with differences only occurring after cleanup of the 
labeled material. After cleanup, we add the blocking and hybridization 
solutions recommended by the manufacturer, then carry out hybridization 
as described in their protocols. 

• Solutions (make beforehand): 

2.5x random primer mix (can be made, or can use the buffer from
Invitrogen's "BioPrime Klenow" labeling kit):

125 mM Tris, pH 6.8
12.5 mM MgCl2
25 mM β-mercaptoethanol
750 µg/ml random octamers

10x dNTP mix:

1.2 mM dATP, dGTP, dTTP
0.6 mM dCTP
10 mM Tris (pH 8.0), 1 mM EDTA

tRNA:

5 mg/ml in water
Store aliquots at -80 °C

Poly-A RNA:

5 mg/ml in water
Store aliquots at -80 °C



4.1. Protocol

Add 2-3 µg of nucleosomal DNA to a 0.5-ml Eppendorf, and bring the
volume up to 21 µl. In another tube, do the same with 2-3 µg of sheared
genomic DNA.

Add 20 µl 2.5x random primer mix.

Boil 5 min (we use a heat block at 95 °C with water in the tube holders),
place on ice.

After 5 min on ice, add:

- 5 µl dNTP mix

- 3 µl Cy3-dCTP (to nucleosomes) or Cy5-dCTP (to gDNA); the
colors can be swapped, but we typically run experiments in this
direction

- 1 µl high-concentration Klenow.

Place tubes at 37 °C for 1 h.

Add another 1 µl of Klenow.

Incubate another hour at 37 °C.

Stop reactions with 5 µl 0.5 M EDTA.

Mix the two colors, add 400 µl TE (pH 8.0 or 7.4) to stopped reactions, and
add to Microcon 30 filter.

Spin 10-11 min at 10,000 rpm in microcentrifuge, until ~30-50 µl remain;
solution should be darkly colored.

Add another 400 µl TE.

Add 100 µg yeast tRNA (Sigma).

Add 20 µg poly-A RNA.

Spin 12-13 min at 10,000 rpm, until volume is less than 40 µl.

Recover labeled DNA by inverting Microcon filter in a fresh collection
tube, and spinning at 10,000 rpm for 1 min.

Measure volume of recovered DNA and transfer to a 0.5-ml Eppendorf
tube.

Bring volume up to 40 µl with water.

Add 8.5 µl 20x SSC.

Add 1.5 µl 10% SDS (make sure not to add any more SDS than this; we
typically touch the edge of the pipette tip against a clean plastic surface to
wipe away any SDS stuck to the outside of the pipette tip).

Boil hybridization mix for 2 min (we use a heat block at 95 °C with water in
the wells to ensure good heat transfer).

Remove tubes and let them sit at room temperature (in the dark, either in
foil or in a drawer) for 10-15 min.

Give the tube a quick spin, apply to microarray, and hybridize for 12-16 h
at 65 °C.

Wash microarray and scan.




5. Generation of Nucleosomal DNA Libraries
for Deep Sequencing

Over the past few years, ultrahigh-throughput "deep" sequencing
methods have been effectively used for the analysis of mononucleosomal
samples. As of this writing, two major commercial machines have been used
for deep sequencing: 454 (Roche) and Solexa (Illumina). We have
experience with Solexa sequencing and so present a protocol for Solexa
library construction, but use of other sequencing methodologies will
simply require creating libraries per the instructions of the
manufacturer. We use Solexa because it provides several million reads 36
bases in length. Thirty-six bases is long enough to uniquely identify the
vast majority of sequences in the yeast genome, and given that there are
~60,000 nucleosomes in yeast, 3 million reads provide 50x coverage of
each nucleosome, allowing occupancy changes to be assessed between
conditions. Solexa machine upgrades allow paired-end reads and longer
reads, and neither of these requires a change in the basic protocol.
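
The coverage figure follows from dividing the number of reads by the
number of nucleosomes. The short Python sketch below reproduces that
arithmetic and inverts it to estimate how many reads are needed for a
target coverage; the function names and the mappable-fraction parameter
are assumptions added for illustration, and reads are treated as evenly
distributed over nucleosomes.

```python
def reads_per_nucleosome(total_reads, n_nucleosomes):
    """Average number of reads per nucleosome, assuming even distribution."""
    return total_reads / n_nucleosomes


def reads_needed(target_coverage, n_nucleosomes, mapped_fraction=1.0):
    """Raw reads required for a target per-nucleosome coverage.

    mapped_fraction: assumed fraction of reads that map uniquely (hypothetical).
    """
    return target_coverage * n_nucleosomes / mapped_fraction


if __name__ == "__main__":
    print(reads_per_nucleosome(3_000_000, 60_000))   # 50.0
    print(round(reads_needed(50, 60_000, mapped_fraction=0.8)))
```

Actual coverage is uneven, since well-positioned, highly occupied
nucleosomes collect more reads than poorly occupied ones, so the average
should be read as a planning figure rather than a guarantee for any
individual nucleosome.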

Start with DNA from MNase titration, after RNase treatment, but prior to
gel purification.

Clean up MNase titration with Qiagen MinElute (elute in 50 µl EB).

Treat DNA with alkaline phosphatase (CIP, NEB M0290L) for 1 h at
37 °C.

Gel purify mononucleosomal DNA on 1.8% agarose gel, as described
above. Use BioRad Freeze-N-Squeeze tubes to purify DNA from gel.
Mince band, freeze for 5 min at -20 °C, spin at 13,000 × g for 3 min. We
typically add another 50-100 µl of TE on top of the squished gel in the
tube and spin again to increase yield, but this is not necessary.

Repair DNA ends using End-It DNA End-Repair Kit (Epicentre
Biotechnologies ER0720):

DNA                      150 ng
End-It buffer            5 µl
dNTP mix (2.5 mM)        5 µl
ATP (10 mM)              5 µl
End-It enzyme mix        1 µl
Water                    Q/S to 50 µl

Incubate at room temperature for 1 h.

Clean up reaction with Qiagen MinElute column, elute in 30 µl EB.

Klenow exo- (Epicentre Biotechnologies KL06041K):

DNA                      30 µl
Klenow buffer            5 µl
dATP (10 mM)             1 µl
Water                    14 µl
Klenow exo-              1 µl

Incubate at room temperature for 45 min.

Clean up reaction with Qiagen MinElute column, elute in 20 µl EB.

Dry DNA down to about 10 µl in a Speed Vac, then bring to exactly 10 µl
with water.

Ligate Illumina adapters to polished mononucleosomal DNA using Fast-Link
DNA Ligation Kit (Epicentre Biotechnologies LK0750H):

DNA                          10 µl
Fast-Link Ligation Buffer    1.5 µl
ATP (10 mM)                  0.75 µl
Ligase                       1 µl
Genomic adapters             2 µl

Incubate at room temperature for 1 h, then add:

Water                        7.5 µl
Fast-Link Ligation Buffer    1 µl
ATP (10 mM)                  0.5 µl
Ligase                       1 µl

Incubate at 16 °C overnight.

Clean up reaction with Qiagen MinElute column, elute in 30 µl EB.

Amplify library with Pfx polymerase (Invitrogen 11708-039):

DNA                                       30 µl
Pfx buffer                                10 µl
Illumina genomic primers (1.1 and 2.1)    1 µl each
dNTPs (10 mM)                             3 µl
MgSO4 (50 mM)                             2 µl
Water                                     53 µl
Pfx                                       1 µl

Step number    Temperature (°C)    Time
1              94                  2 min
2              94                  15 s
3              65                  1 min
4              68                  30 s
5              Go to step 2        18 times
6              68                  5 min
7              4                   Forever



Gel-purify library on 1.5% agarose gel. Band should run at ~250 bp.
We sometimes see two bands here. In our experience the smaller
band corresponds to primer dimer, and if it is an issue it can be eliminated
by gel-purifying DNA after the primer ligation step above, prior to
PCR.

Freeze-N-Squeeze purify mononucleosome band from gel as above.

Clone a small portion (~100 ng) of the library using TOPO cloning, and
transform into BL21DE3 competent Escherichia coli.

The next day, isolate 20 colonies, and isolate plasmids by miniprep method of
your choice. Send these 20 library inserts out for sequencing (or sequence
by hand). Inserts should average ~120 bp for our typical MNase digestion
level, should map to the yeast genome, and should contain no
mitochondrial sequences. If we get more than three inserts which are simply
primer dimers, we remake the library from mononucleosomal DNA.
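
The acceptance criteria for the test clones can be written down as a
simple checklist. The Python sketch below is illustrative only: the
record format, the function name, and the 30 bp tolerance around the
~120 bp expected insert length are assumptions, while the limit of three
primer dimers and the other criteria come from the text above.

```python
def library_qc(inserts, expected_len=120, len_tolerance=30, max_primer_dimers=3):
    """Return a list of problems found among sequenced test clones.

    inserts: list of dicts with keys 'length' (bp), 'maps_to_yeast' (bool),
             'mitochondrial' (bool), and 'primer_dimer' (bool).
    An empty list means the library passes these checks.
    """
    problems = []
    dimers = sum(1 for ins in inserts if ins["primer_dimer"])
    real = [ins for ins in inserts if not ins["primer_dimer"]]
    if dimers > max_primer_dimers:
        problems.append("too many primer dimers: remake library")
    if real:
        mean_len = sum(ins["length"] for ins in real) / len(real)
        if abs(mean_len - expected_len) > len_tolerance:   # assumed tolerance
            problems.append(f"mean insert length {mean_len:.0f} bp is off target")
    if any(not ins["maps_to_yeast"] for ins in real):
        problems.append("some inserts do not map to the yeast genome")
    if any(ins["mitochondrial"] for ins in real):
        problems.append("mitochondrial sequences present")
    return problems


if __name__ == "__main__":
    good = {"length": 118, "maps_to_yeast": True,
            "mitochondrial": False, "primer_dimer": False}
    print(library_qc([dict(good) for _ in range(20)]))   # []
```

Keeping the thresholds as parameters makes it easy to adjust them for a
different digestion level, where the expected insert length will shift.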

If library looks OK, submit 30 µl of 10 nM sample in EB to your Solexa
operator.

Abstract

We present a detailed protocol for ribosome profiling, an approach that we
developed to make comprehensive and quantitative measurements of translation in
yeast. In this technique, ribosome positions are determined from their nuclease
footprint on their mRNA template and the footprints are quantified by deep
sequencing. Ribosome profiling has already enabled highly reproducible
measurements of translational control. Because this technique reports on the exact position
of ribosomes, it also revealed the presence of ribosomes on upstream open reading
frames and demonstrated that ribosome density was higher near the beginning of
protein-coding genes. Here, we describe nuclease digestion conditions that
produce uniform ~28 nucleotide (nt) protected fragments of mRNA templates that
indicate the exact position of translating ribosomes. We also give a protocol for
converting these RNA fragments into a DNA library that can be sequenced using the
Illumina Genome Analyzer. Unbiased conversion of anonymous, small RNAs into a
sequencing library is challenging, and we discuss standards that played a key role
in optimizing library generation. Finally, we discuss how deep sequencing data can
be used to quantify gene expression at the level of translation.




1. Introduction 

Gene expression is now measured routinely to characterize the
physiological state of cells and to determine the molecular basis of cellular function and
dysfunction. Gene expression profiling typically uses mRNA abundance, which
can be measured easily, as a proxy for protein production, which is the ultimate
effect of gene expression. However, these measurements of mRNA abundance
are blind to regulation of protein translation, and there is clear interest in
approaches for making comprehensive measurements of protein synthesis.
Translational control plays a major role in cellular stress responses in yeast
(Hinnebusch, 2005), and homologous pathways are important in mammals as
well (Holcik and Sonenberg, 2005). Translation is also regulated in development
and differentiation (Sonenberg and Hinnebusch, 2009), including the
establishment of mother/daughter asymmetry in yeast (Chartrand et al., 2002; Gu et al.,
2004) as well as the filamentous growth response (Gilbert et al., 2007).

Microarrays (Brown and Botstein, 1999), and more recently deep sequencing 
(Mortazavi et al., 2008; Nagalakshmi et al., 2008), have allowed rapid and 
comprehensive measurements of mRNA levels. Polysome profiling emerged 
as a technique for measuring translation with microarrays (Arava et al., 2003; 
Johannes et al., 1999; Zong et al., 1999) by fractionating transcripts according to 
the number of bound ribosomes and analyzing the distribution of mRNAs in 
the resulting fractions. Increases or decreases in the amount of protein 
synthesized per mRNA will be reflected in the number of ribosomes bound, so 
translational regulation will shift the distribution of a message between different 
fractions. This polysome profiling approach has provided measurements of 
genome-wide translation in yeast, most notably in response to starvation (Preiss 
et al., 2003; Smirnova et al., 2005). However, the imprecision in polysome 
fractionation, especially for large numbers of ribosomes, limits the quantitative 
resolution of polysome profiling. More fundamentally, this approach cannot 
distinguish ribosomes that are translating protein-coding genes from those on 
upstream open reading frames (uORFs) (Arava et al., 2005). 

In this chapter, we present a protocol for ribosome profiling, a technique 
for making quantitative, high-resolution measurements of all cellular translation. 
We have used ribosome profiling to measure basal translational efficiency as well 
as translational regulation and to characterize known and novel sites of 
noncanonical translation (Ingolia et al., 2009). Ribosome profiling combines the 
classic observation that the nuclease digestion footprint of a ribosome on an 
mRNA indicates its exact position (Steitz, 1969; Wolin and Walter, 1988) with 
recent advances in ultra-high-throughput sequencing (Bentley et al., 2008) that 
allow the analysis of millions of footprints in parallel. We describe the 
generation of ribosome footprints as well as the techniques for converting them 
into a deep sequencing library. We also discuss the analysis of ribosome 
footprint sequencing data, focusing on measurements of gene expression. 




2. Ribosome Footprint Generation and 
Purification 

Ribosome profiling requires the preparation of cell extracts containing 
mRNA-bound ribosomes. One major concern in preparing these extracts is 
ensuring that the polysomes recovered in the extract reflect the physiological 
status of translation in the living yeast. However, yeast alter translation very 
quickly in response to environmental changes such as removal of nutrients (Ashe 
et al., 2000; Barbet et al., 1996), and cells must be removed from growth media to 
prepare extracts. To minimize perturbations of in vivo translation profiles, 
polysomes are stabilized by adding cycloheximide to cells immediately before they 
are harvested. Cycloheximide is a translation elongation inhibitor that 
immobilizes ribosomes (Godchaux et al., 1967; McKeehan and Hardesty, 1969), thereby 
preserving a snapshot of their location at the time of cycloheximide addition. 
Following cycloheximide addition, care is also taken to freeze cells in liquid 
nitrogen as quickly as possible. If cells are harvested and frozen quickly, ribosome 
footprinting can be performed on cycloheximide-free extracts. Footprinting in 
these drug-free extracts shows a marked depletion of ribosome footprints from 
the 5' region of protein-coding genes (Ingolia et al., 2009), consistent with the 
idea that initiation is rapidly blocked during cell harvesting and that 
cycloheximide prevents ribosome run-off during extract preparation. 

Ribosome footprints are generated by nuclease digestion of polysomes 
in cell extracts. Nuclease treatment degrades the mRNA that links together 
polysomes, freeing individual ribosomes (Fig. 6.1). These ribosomes, 
bound to ~28 nucleotide (nt) protected mRNA fragments, are isolated 
by sucrose density gradient centrifugation. The footprint mRNA fragments 
themselves are then purified through two size selection steps. 

Figure 6.1 Schematic of footprint fragment generation. Polysomes, consisting of 
actively translating ribosomes bound to mRNA templates, are nuclease digested 
to degrade transcripts. Ribosomes protect a ~28 nt mRNA footprint from nuclease 
digestion. After digestion, footprint-bound ribosomes are isolated and footprint 
fragments are recovered. 



2.1. Extract preparation 

Grow a 25-ml overnight starter culture in YEPD and prewarm growth 
media to 30 °C. Prepare 750 ml prewarmed YEPD in a 2800-ml baffled 
flask and inoculate it with yeast from the starter culture to an initial OD600 
of roughly 0.03. Grow cells at 30 °C with 250 rpm shaking to a final OD600 
of 0.6-0.7, which requires 7-8 h for the standard laboratory strain S288C. 
The growth conditions can be modified to measure the translational effects 
of different perturbations provided that the same final quantity of cells is 
available. 
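
As a rough cross-check on the timing above, the doubling time implied by the starting and final OD600 can be estimated with a short calculation. This is a minimal sketch that assumes uninterrupted exponential growth with no lag phase; the function name is illustrative and is not part of the protocol.

```python
import math

def doubling_time_min(od_start, od_end, hours):
    """Doubling time in minutes, assuming pure exponential growth (no lag)."""
    doublings = math.log2(od_end / od_start)
    return hours * 60.0 / doublings

# OD600 0.03 -> 0.6 corresponds to log2(20), about 4.3 doublings.
for hours in (7, 8):
    print(hours, "h:", round(doubling_time_min(0.03, 0.6, hours)), "min per doubling")
```

With the values quoted above this gives roughly 97 to 111 min per doubling. A slower-growing strain or medium would need an earlier inoculation or a higher starting density to reach the same final quantity of cells.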

To freeze cells as quickly as possible, prepare for cell harvesting before 
adding cycloheximide to the culture. Make the polysome lysis buffer (see 
Section 5.1 for all solutions) and chill it on ice. Fill a 50-ml conical tube with 
liquid nitrogen and leave it partly immersed in liquid nitrogen. Pierce the 
cap of the tube several times with a 20-gauge needle to allow nitrogen vapor 
to escape. 

Add 1.5 ml cycloheximide from a 50-mg/ml stock in ethanol to reach a 
final concentration of 100 µg/ml. Continue cell growth for 2 min, mixing 
well, to allow the cycloheximide to act. Harvest cells by filtration, which 
provides rapid and complete removal of media while minimizing 
perturbations before cells are actually frozen. Use a 90-mm cellulose nitrate filter 
with a 0.45 µm pore size (Whatman 7184-009) in a vacuum filtration 
apparatus with a fritted glass support (Kontes 953755-0090). Prewarm the 
funnel with a small amount of prewarmed growth media and then harvest 
cells by filtration. As soon as the last liquid has passed through the filter, 
remove the funnel and scrape the cells from the filter with a metal spatula, 
taking care not to tear the membrane. Resuspend the cells into a slurry 
in 2.5 ml of ice-cold polysome lysis buffer as quickly as possible, using a 
1000-µl pipette to mix and then pipette cells. Drip the cell slurry into the 
liquid nitrogen-filled 50 ml conical and cap it, taking care to ensure that 
the cap is pierced as described above. Place the conical tube upright in a 
-80 °C freezer to allow the liquid nitrogen to evaporate. 
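
The cycloheximide addition at the start of this step is a simple dilution, and the stated final concentration can be verified or rescaled for a different culture volume. The sketch below is only an arithmetic check; the function name is an illustrative choice.

```python
def final_conc_ug_per_ml(stock_mg_per_ml, stock_ml, culture_ml):
    """Final concentration (ug/ml) after adding a stock solution to a culture."""
    total_ug = stock_mg_per_ml * 1000.0 * stock_ml
    return total_ug / (culture_ml + stock_ml)

# 1.5 ml of 50 mg/ml stock into a 750 ml culture
print(round(final_conc_ug_per_ml(50, 1.5, 750), 1), "ug/ml")
```

The result is within 1% of the nominal 100 µg/ml, so the small volume contributed by the stock can be ignored in practice.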

Lyse yeast by cryogenic grinding in a mixer mill (Retsch MM301). The 
principal concern throughout the lysis is to avoid thawing the extract, and 
this is addressed by chilling all equipment in liquid nitrogen. Begin by 
opening a stainless steel grinding chamber and then immersing it and the 
grinding ball in liquid nitrogen until the nitrogen stops boiling vigorously. 
Remove the chamber using tongs and pour out all residual liquid nitrogen. 
Place the frozen yeast pellets and the grinding ball into the chamber and 
screw it shut tightly. Return the sealed chamber to liquid nitrogen until it is 
fully rechilled and boiling again stops. Remove the chamber, loosen it one- 
quarter turn, and mount it on the mixer mill. Grind the sample for 3 min at 
15 Hz, then quickly remove the chamber, retighten the seal, and chill it in 
liquid nitrogen until the nitrogen stops boiling around the chamber. Repeat 
this process for six total cycles of grinding. After the last grinding cycle, chill 
the chamber as well as a metal scoop. Fill a new 50 ml conical tube with 
liquid nitrogen and use the chilled metal scoop to transfer cell powder from 
the chamber into the conical tube. Keep the scoop cold and, if the recovery 
of the powder takes too long, reseal and rechill the sample chamber as 
well. Again place the conical tube upright at -80 °C to allow liquid 
nitrogen to evaporate. Use caution, as the powder is much more easily 
dispersed by boiling liquid nitrogen than the frozen cell droplets. 

Thaw the cell powder gently and, as soon as it is fully thawed, spin the 
tube 5 min at 3000 × g in a 4 °C centrifuge to collect unbroken cells and large 
debris. Remove the supernatant to chilled 1.5 ml microfuge tubes on ice 
and clarify the lysate by spinning 10 min at 20,000 × g in a 4 °C centrifuge. 
Recover the supernatant, avoiding both the pellet and the lipid layer at 
the top of the tube. Find the concentration of the extract by measuring the 
A260 of a 200-fold dilution. Dilute the extract with lysis buffer to achieve 
an undiluted A260 of 200. At this point, aliquots of the extract can be frozen 
in liquid nitrogen and stored for at least 6 months at -80 °C. 
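
The normalization to an undiluted A260 of 200 can be worked out from the reading of the 200-fold dilution. The following is a minimal sketch; the example reading (1.3) and extract volume (1000 µl) are made-up values for illustration, and the function name is hypothetical.

```python
def lysis_buffer_to_add(a260_diluted, dilution_factor, extract_ul, target_a260=200.0):
    """Return (undiluted A260, ul of lysis buffer needed to reach the target A260)."""
    undiluted = a260_diluted * dilution_factor
    if undiluted < target_a260:
        raise ValueError("extract is already below the target A260")
    final_ul = extract_ul * undiluted / target_a260
    return undiluted, final_ul - extract_ul

# Illustrative reading: A260 of 1.3 for the 200-fold dilution of 1000 ul extract
undiluted, buffer_ul = lysis_buffer_to_add(1.3, 200, 1000)
print(round(undiluted), "undiluted A260; add", round(buffer_ul), "ul lysis buffer")
```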

2.2. Nuclease digestion and monosome purification 

Nuclease digestion must fully remove unprotected mRNA to produce 
footprints that precisely define the position of the ribosome. Excessive 
digestion can degrade the ribosome, which contains an essential RNA 
component, potentially resulting in the loss of footprint fragments. Tests 
of commercially available nucleases showed that Escherichia coli RNase I was 
best able to produce ribosome footprint fragments of uniform size. The 
concentration of RNase I in the digestion reactions was optimized to 
produce ~28 nt footprint fragments (Ingolia et al., 2009). The RNase 
inhibitor SUPERase-In is effective against RNase I, and for this reason it 
is used to stop the footprinting digestion. 

Gently thaw a 250-µl aliquot of cell extract and add 7.5 µl RNase I 
(100 U/µl; Ambion AM2294). In parallel, add 5.0 µl SUPERase-In 
(20 U/µl; Ambion AM2694) to another aliquot, which serves as the undigested 
control. Incubate the extract samples for 1 h at room temperature with gentle 
rotation. Stop digestion by adding 5.0 µl SUPERase-In to the RNase-treated 
sample and keep both samples on ice. Load samples onto sucrose gradients for 
monosome purification as soon as possible after stopping the digestion. 
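
When the extract volume differs from the 250 µl used here, it can help to express the nuclease dose relative to the amount of extract. The sketch below derives the enzyme units and the ratio of RNase I to A260 units from the volumes above. The "units per A260 unit" figure is a derived convenience and is not stated in the protocol, and it assumes the extract was adjusted to an A260 of 200 as described in Section 2.1.

```python
def enzyme_units(volume_ul, units_per_ul):
    """Total enzyme units delivered by a given volume of stock."""
    return volume_ul * units_per_ul

def a260_units(a260, volume_ul):
    """A260 units in a sample: absorbance multiplied by volume in ml."""
    return a260 * volume_ul / 1000.0

rnase_u = enzyme_units(7.5, 100)        # RNase I, 100 U/ul
inhibitor_u = enzyme_units(5.0, 20)     # SUPERase-In, 20 U/ul
extract_units = a260_units(200, 250)    # 250 ul extract at A260 = 200

print(rnase_u, "U RNase I;", inhibitor_u, "U SUPERase-In")
print(rnase_u / extract_units, "U RNase I per A260 unit of extract")
```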

Prepare 10% and 50% (w/v) sucrose solutions in polysome gradient 
buffer. Chill the ultracentrifuge rotor at 4 °C. Prepare gradients in SW41 
14 × 89 mm ultracentrifuge tubes. Mark the tubes according to the marker 
block and fill with 10% sucrose solution to the mark. Use the cannula to 
underlay 50% sucrose solution to fill the tube entirely. Gently insert the 
rubber cap into the top of the tube at an angle, with the hole entering the 
liquid last to allow air to escape. Form the gradient using a Gradient Master 
(BioComp Instruments) with rotation at an angle of 81.5°, speed 16, for 1:58. Load 
the gradients into the SW41 buckets and balance them by adding additional 
10% sucrose solution to the lighter gradient, then return the buckets with 
gradients to 4 °C. Load extract samples onto sucrose density gradients and 
seal the buckets well. Centrifuge gradients for 3 h at 35,000 rpm at 4 °C. 
Retrieve the rotor buckets and store them at 4 °C before fractionation. 

Fractionate sucrose gradients on the Gradient Station (BioComp 
Instruments) at 0.2 mm/s. Measure the A260 of the collected material 
using a continuous UV monitor (such as the BioRad EM-1 Econo UV 
Monitor) to identify the 80S ribosome peak. Analyze the undigested sample 
first to ensure that it contains intact polysomes and to determine the 
approximate location of the 80S ribosome peak. Then, fractionate the 
digested sample, which has a much larger 80S monosome peak than the 
undigested sample. Collect this peak, which typically has a volume of 
0.8 ml, freeze it, and store at -80 °C. 

2.3. Footprint fragment purification 

Purify RNA from the monosome fraction using the SDS/phenol method. 
Heat acid phenol-chloroform (5:1, pH 4.5; Ambion AM9720) to 65 °C in a 
fume hood. Place the monosome fraction sample at 65 °C as well and add 
SDS to a final concentration of 1% (w/v). Add 1 volume of hot acid 
phenol-chloroform and incubate 5 min at 65 °C, vortexing frequently. 
Place samples on ice for 5 min, then spin 2 min at full speed in a tabletop 
microfuge. Recover the aqueous phase, which should be the upper phase if 
phenol-chloroform was used, but which may be the lower phase if only 
phenol was used due to the high concentration of sucrose in the aqueous 
phase. Phase lock gel should be avoided as the sucrose may make the 
aqueous phase more dense than the gel. 

Reextract the aqueous phase with 1 volume of acid phenol-chloroform, 
vortex for 5 min at room temperature, and spin 2 min at full speed in a 
tabletop microfuge to separate the phases. Carefully recover the aqueous 
phase, add 1 volume of chloroform-IAA (24:1), and vortex 1 min at room 
temperature. Spin 1 min at full speed in a tabletop microfuge to separate 
the phases and then recover the aqueous phase. 

Purify footprint fragments by filtration through a Microcon YM-100 
microconcentrator (Millipore 42413). Dilute the aqueous phase to 0.50 ml 
with 10 mM Tris-Cl (pH 7) and add 2.0 µl SUPERase-In. Prepare a 
microconcentrator collection tube by placing 2.0 µl SUPERase-In in the 
tube and then insert the YM-100. Load the sample on the 
microconcentrator and spin at 500 × g, room temperature, until 400-425 µl 
flow-through has been recovered. This typically takes 20-30 min. Further 
centrifugation, or centrifugation at higher force, will dramatically increase the 
amount of high-molecular-weight RNA that passes through the filter. Precipitate 
RNA from the flow-through (see Section 5.3), using a coprecipitant 
because the total RNA concentration at this point is quite low. 

Resuspend the RNA in 10 µl 10 mM Tris-Cl, pH 7. At this point, 
dephosphorylate the RNA if it is to be converted into a sequencing library. If 
randomly fragmented mRNA is prepared (see Section 2.4), dephosphorylate 
that sample in parallel. Fragments produced by RNase digestion or alkaline 
hydrolysis (see Section 2.4) have either 3' phosphoester or 2',3' cyclic 
phosphodiester termini (delCardayre and Raines, 1995; Markham and 
Smith, 1952). The phosphatase activity of T4 polynucleotide kinase can 
dephosphorylate either terminus to leave a free 3' hydroxyl (Amitsur et al., 
1987; Cameron and Uhlenbeck, 1977). Empirically, polynucleotide kinase 
converts a larger fraction of RNase and alkaline hydrolysis fragments into 
suitable substrates for polyadenylation than does alkaline phosphatase. 
To dephosphorylate RNA fragments, denature RNA 2 min at 80 °C, then 
place on ice. Add 2.0 µl 10× T4 polynucleotide kinase reaction buffer 
without ATP, 0.5 µl SUPERase-In, 6.5 µl water, and 1.0 µl T4 polynucleotide 
kinase (10 U/µl; New England Biolabs M0201S). Incubate 1 h at 37 °C. 

Perform polyacrylamide gel purification for size selection of mRNA 
footprint fragments from the mix of digested rRNA. Add 20 µl 2× denaturing 
gel loading buffer (Invitrogen LC6876) to the dephosphorylated footprint 
sample. Prepare samples of 1.0 µl 10 bp DNA ladder (1 µg/µl, 
Invitrogen 10821-015), and of 1.0 µl 20 µM marker oligo, each diluted to 
10.0 µl, with 10.0 µl 2× denaturing gel loading buffer. Denature all samples 
2 min at 80 °C and place on ice. Load a 15% polyacrylamide denaturing 
gel and perform electrophoresis to separate fragments of roughly 30 nt. 
For precast polyacrylamide mini-gels (Invitrogen EC6875BOX), 65 min at 
200 V resolves the footprint fragments well. Stain the gel for 3 min in SYBR 
Gold (diluted from 10,000× in 1× TBE; Invitrogen S11494), then visualize 
by UV transillumination. Excise the region near the 28 nt marker 
oligonucleotide (Fig. 6.2) and recover RNA from the gel slice (see Section 5.4). 
Excise the marker oligonucleotide as well and carry it through the library 
generation protocol in parallel with the footprint sample. It will serve as a 

Figure 6.2 Size selection of footprint fragments. (A) A denaturing 15% 
polyacrylamide gel was loaded with denatured 10 bp DNA ladder, two nuclease footprinting 
samples, two samples of randomly fragmented mRNA, and two samples of the 28 nt 
marker oligonucleotide. After electrophoresis, the gel was stained with SYBR Gold. 
The footprinting samples contain an array of characteristic rRNA fragments as well as 
the ribosome-protected mRNA fragments. (B) The gel depicted in (A), after the 
footprint fragment region from 25 to 31 nt was excised from all samples, guided by 
the marker oligo and the ladder. 

positive control for subsequent steps and provide a size marker for gel 
purifications. 

Resuspend the recovered RNA, typically 50-200 ng, in 10.0 µl 10 mM 
Tris-Cl, pH 7. A significant fraction of this RNA is digested rRNA. Quantify 
the size-selected RNA using the Agilent BioAnalyzer Small RNA assay. 
Dilute 1.0 µl of the size-selected footprint sample with 4.0 µl RNase-free 
water (1:5), and make a second serial dilution of 1.0 µl of the first dilution 
into 4.0 µl water (1:25). One of these dilutions should lie within the range of 
RNA concentrations for accurate quantitation. Load the 1:25 dilution before the 
1:5 dilution, as overloading in earlier samples can distort measurements in 
later samples. 
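
The serial dilution above gives cumulative dilution factors of 5 and 25, which are needed to convert the assay reading back to the concentration of the undiluted sample. The sketch below shows that back-calculation; the measured value of 0.4 ng/µl is a made-up example and the function names are illustrative.

```python
def serial_dilution_factors(sample_ul, diluent_ul, steps):
    """Cumulative dilution factor after each step of a serial dilution."""
    per_step = (sample_ul + diluent_ul) / sample_ul
    return [per_step ** i for i in range(1, steps + 1)]

def undiluted_conc(measured_conc, dilution_factor):
    """Concentration of the original sample from a reading on a dilution."""
    return measured_conc * dilution_factor

factors = serial_dilution_factors(1.0, 4.0, 2)
print(factors)  # cumulative factors for the first and second dilution

# Illustrative reading: 0.4 ng/ul measured on the second (1:25) dilution
stock = undiluted_conc(0.4, factors[1])
print(round(stock, 1), "ng/ul undiluted;", round(stock * 10.0), "ng in 10 ul")
```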



2.4. Fragmented mRNA preparation 

Deep sequencing measurements of fragmented mRNA provide a valuable 
control and comparison for ribosome footprint density. Both mRNA and 
ribosome measurements are needed to distinguish translational regulation 
from control that operates at the level of mRNA transcription and stability. 
Many standard methods exist to purify mRNA from yeast. These typically 
involve lysis and total RNA extraction followed by mRNA purification via 
recovery of polyadenylated messages or removal of rRNA. The following 
mRNA isolation protocol (Kohrer and Domdey, 1991) is presented as one 
good approach, but alternatives can doubtlessly be substituted. 
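
To illustrate why both measurements are needed, one common way to combine them is the ratio of ribosome footprint density to mRNA fragment density for each gene, often called translational efficiency. The sketch below is an illustration under that assumption and is not presented as this chapter's prescribed analysis; the density values are invented and the function names are hypothetical.

```python
import math

def log2_te(footprint_density, mrna_density):
    """log2 translational efficiency: footprint density relative to mRNA density."""
    return math.log2(footprint_density / mrna_density)

def log2_te_change(fp_a, mrna_a, fp_b, mrna_b):
    """Change in log2 translational efficiency from condition A to condition B."""
    return log2_te(fp_b, mrna_b) - log2_te(fp_a, mrna_a)

# Invented example 1: mRNA and footprints both double, so efficiency is unchanged
print(log2_te_change(fp_a=100, mrna_a=50, fp_b=200, mrna_b=100))
# Invented example 2: mRNA constant while footprints rise fourfold
print(log2_te_change(fp_a=100, mrna_a=50, fp_b=400, mrna_b=50))
```

In the first case the change in footprint density is fully explained by mRNA abundance, whereas in the second it reflects a change in translation per message. Footprint data alone could not separate the two.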

Resuspend a small fraction of harvested cells in 600 µl total RNA lysis 
buffer rather than polysome lysis buffer. Cells may be frozen in lysis buffer 
and stored at -80 °C indefinitely. Lyse cells and extract RNA using the hot 
acid-phenol method. Heat aliquots of acid phenol-chloroform and total 
RNA samples to 65 °C. Add SDS to RNA samples to a final concentration 
of 1% (w/v) and then add 1 volume of hot acid phenol. Proceed with phenol 
extraction as described above for recovering RNA from the purified 
monosome fraction (see Section 2.3). The total RNA sample will contain much 
more debris than the ribosome fraction and may require additional rounds of 
phenol extraction. After a final chloroform extraction, precipitate the RNA 
(see Section 5.3). The RNA yield will be quite high and a coprecipitant is 
not needed. Determine the yield and purity of RNA by spectrophotometry. 

Purify mRNA from total RNA using oligo-dT-coated magnetic beads 
(Dynabeads mRNA purification kit; Invitrogen 610-06) essentially as 
described by the manufacturer. Prepare 220 µl 1× binding buffer by 
diluting it from the 2× binding buffer stock. Resuspend the magnetic 
beads by vortexing and take 150 µl beads to a nonstick tube. Collect 
beads by placing the tube on a magnetic rack for 30 s and carefully pipetting 
away the storage buffer. Immediately resuspend beads in 100 µl 1× binding 
buffer. Repeat this procedure to wash beads again in 1× binding buffer and 
leave them in binding buffer. Take 150 µg total RNA and dilute to a final 
volume of 50 µl with RNase-free water. Add 50 µl 2× binding buffer and 
denature 2 min at 80 °C, then return immediately to ice and add 1.0 µl 
SUPERase-In. Remove the binding buffer from the magnetic beads and 
resuspend them in the RNA sample. Incubate 5 min at room temperature 
with gentle rotation to allow mRNA to bind to the beads. Return the tube 
to the magnetic rack to collect the beads for 30 s and remove the unbound 
sample. Wash the beads twice in 100 µl wash buffer B. Resuspend the beads 
in 20 µl 10 mM Tris-Cl (pH 7) and elute the RNA by heating the beads to 
80 °C for 2 min. Immediately place the tube on the magnetic rack for 30 s 
and remove the eluate to a new tube. 

Prepare the mRNA fragmentation reaction by mixing 20 µl of RNA 
with 20 µl of 2× alkaline fragmentation buffer. Incubate the fragmentation 
reaction for 40 min at 95 °C and return the reaction to ice immediately. 
Prepare 560 µl stop/precipitation solution in a nonstick microfuge tube on 
ice and add the fragmentation reaction to the stop solution. Add at least 
600 µl isopropanol, then precipitate (see Section 5.3). The stop solution 
already contains 300 mM salt. 

Resuspend the fragmented mRNA in 10 µl 10 mM Tris-Cl (pH 7) and 
then dephosphorylate the fragments and perform gel size selection in parallel 
with the ribosome footprint sample (see Section 2.3). All mRNA fragments 
will be quite small, so there is no need to filter the fragmentation reaction 
through a microconcentrator. 




3. Sequencing Library Preparation 

Short ribosome footprint RNA fragments must be converted into a 
DNA library suitable for deep sequencing. Sequencing libraries for the 
Illumina Genome Analyzer are pools of DNA molecules with constant 
linker sequences on both sides of the query fragment (Bentley et al., 
2008). Creating this library requires attaching linkers to both ends of the 
RNA fragment as well as reverse transcription. Any sequence preferences 
in library generation will distort the measured abundance of these frag- 
ments. There are many approaches to library generation, and a standard 
RNA fragment sample is needed to compare between them. A complex 
but well-characterized pool of small RNAs can be obtained by partial 
alkaline hydrolysis of yeast mRNA (see Section 2.4). This high-temperature 
chemical treatment causes relatively uniform fragmentation of RNA, and 
different fragments of the same transcript should be present at roughly 
equal abundance in the sample. Deviations from uniform sequencing read 
coverage across each transcript indicate distortions introduced by library 
generation, so different protocols can be compared directly by quantifying 
the uniformity of sequencing coverage. 
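
One simple way to quantify the uniformity described above is the coefficient of variation (standard deviation divided by mean) of read coverage across a transcript, with lower values indicating less distortion. The sketch below assumes this particular metric, which the text does not specify, and uses invented per-position read counts for two hypothetical protocols.

```python
from statistics import mean, pstdev

def coverage_cv(counts):
    """Coefficient of variation of read coverage across one transcript."""
    m = mean(counts)
    if m == 0:
        raise ValueError("transcript has no coverage")
    return pstdev(counts) / m

# Invented read counts at successive positions of the same transcript
protocol_a = [10, 11, 9, 10, 12, 8, 10, 10]   # fairly even coverage
protocol_b = [1, 30, 2, 25, 1, 1, 19, 1]      # strongly position-dependent coverage

print("protocol A CV:", round(coverage_cv(protocol_a), 2))
print("protocol B CV:", round(coverage_cv(protocol_b), 2))
```

In practice such a statistic would be computed for many well-expressed transcripts and summarized across them, since sampling noise alone produces nonzero variation on sparsely covered messages.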

Extensive optimization has resulted in the library generation protocol 
depicted in Fig. 6.3. The constraints on generating a sequencing library 
from ribosome footprint fragments are similar to those that arise when 
working with endogenous small RNAs such as microRNAs (Berezikov 
et al., 2006). It is necessary to introduce a known sequence on the 3' 
terminus of the RNA fragment to serve as a primer site for reverse 
transcription. For RNA fragments produced by nuclease digestion or alkaline 
hydrolysis, this first requires dephosphorylation of the 3' terminus to 
produce a 3' hydroxyl that can serve as a substrate for further manipulation. 
Various RNA ligases have been used to attach known linker sequences to 
microRNAs, but these enzymes have significant sequence preferences. 
Polyadenylation of the 3' terminus using E. coli poly(A) polymerase 
(Fu et al., 2005) produced more uniform libraries from fragmented RNA 
than any ligase-based approach tested. Among ligase-based protocols, the 
use of truncated Rnl2 to attach an adenylylated linker (Ho et al., 2004; 
Lau et al., 2001; Pfeffer et al., 2005) produces substantially better results than 
Rnl1 with a phosphorylated linker. 

Attachment of the 5' linker requires ligation, but this ligation can be 
performed as an efficient intramolecular reaction of first-strand cDNA using 
an ssDNA ligase. Intermolecular ligation of a second linker to either RNA 

Figure 6.3 Schematic of sequencing library generation protocol. Small RNA 
fragments are polyadenylated to provide a primer-binding site for reverse transcription. 
A custom oligonucleotide containing an anchored oligo-d(T) primer as well as linker 
sequences and a flexible spacer is used to prime reverse transcription. After reverse 
transcription, the first-strand cDNA is circularized to generate a PCR template with 
linker sequences on both sides of the target RNA fragment. 

or first-strand DNA dramatically distorts mRNA coverage and requires 
additional gel purification to recover the ligation product. Circularized 
ssDNA can serve directly as a sequencing template on the Genome 
Analyzer, though limited PCR amplification to produce a conventional 
linear dsDNA library does not introduce significant distortions in the 
relative abundance of sequences in the library. Because the yield of circular 
ssDNA is relatively low and quantification is more difficult, PCR amplifi- 
cation should be performed unless there is specific evidence that it distorts 
the contents of the sequencing library. 



3.1. Polyadenylation 

It is desirable to add a 25-30 nt poly-A tail on each RNA fragment, to 
ensure that the tail is long enough to serve as a primer-binding site but not 
dramatically longer. The extent of polyadenylation is controlled by using 
relatively low amounts of enzyme and limiting the duration of the reaction. 
Under the reaction conditions below, poly(A) addition to a synthetic 
RNA oligonucleotide substrate is directly proportional to the reaction 
time and proceeds uniformly across the substrate pool (Feng and Cohen, 
2000). The tail length is relatively insensitive to the concentration of RNA 
at or below 500 nM, the concentration used here, but somewhat longer 
reactions may be required for higher concentrations (Fig. 6.4). 

Prepare 10 pmol RNA and dilute to 9.0 µl with RNase-free water. 
Prepare 2× polyadenylation buffer with ATP as well as polyadenylation 
enzyme mix on ice. Denature samples 2 min at 80 °C and return 
immediately to ice. Add 9.0 µl 2× polyadenylation buffer with ATP and 2.0 µl 
enzyme mix and incubate 10 min at 37 °C. Stop the reaction by adding 
80 µl 5 mM EDTA and precipitating the RNA (see Section 5.3). 
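
The reaction volumes above account for the 500 nM substrate concentration quoted earlier in this section, and the same arithmetic can be used to check a reaction set up with a different amount of RNA. This is a minimal sketch with illustrative function names; it simply adds the stated volumes.

```python
def conc_nM(pmol, volume_ul):
    """Concentration in nM from an amount in pmol and a volume in ul."""
    return pmol / volume_ul * 1000.0  # pmol/ul equals uM

# RNA in water + 2x buffer with ATP + enzyme mix
reaction_ul = 9.0 + 9.0 + 2.0
print(conc_nM(10, reaction_ul), "nM RNA in", reaction_ul, "ul")

# EDTA after stopping: 80 ul of 5 mM stock in the combined volume
stopped_ul = reaction_ul + 80.0
print(5.0 * 80.0 / stopped_ul, "mM EDTA in", stopped_ul, "ul")
```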



3.2. Reverse transcription 

Resuspend the polyadenylated RNA samples in 10 µl 10 mM Tris-Cl, pH 7. 
Prepare a template mix with 9.0 µl RNA sample, 1.0 µl dNTPs (10 mM each), 
1.0 µl primer (50 µM), and 2.5 µl water. Denature 5 min at 65 °C, then return 
to ice. Add 4.0 µl 5× FSB, 1.0 µl SUPERase-In, 1.0 µl 0.1 M DTT, mix 
well, and add 1.0 µl Superscript III. Incubate 30 min at 48 °C. Add 2.3 µl 
1 N NaOH and incubate 15 min at 98 °C to hydrolyze the RNA template. 
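
Summing the volumes above shows how the reaction is balanced: the 5× buffer comes to roughly 1×, and the NaOH addition brings the sample to about 0.1 N. The sketch below only tallies the stated volumes; the variable and function names are illustrative.

```python
# Volumes in ul, in the order given in the text
template_mix = [9.0, 1.0, 1.0, 2.5]   # RNA, dNTPs, primer, water
enzyme_mix = [4.0, 1.0, 1.0, 1.0]     # 5x FSB, SUPERase-In, DTT, reverse transcriptase
naoh_ul, naoh_stock_N = 2.3, 1.0

def rt_volumes():
    """Return (RT reaction volume, volume after NaOH, final NaOH normality)."""
    rt_ul = sum(template_mix) + sum(enzyme_mix)
    final_ul = rt_ul + naoh_ul
    return rt_ul, final_ul, naoh_stock_N * naoh_ul / final_ul

rt_ul, final_ul, naoh_final = rt_volumes()
print("RT reaction:", rt_ul, "ul; buffer at", round(5 * 4.0 / rt_ul, 2), "x")
print("after NaOH:", round(final_ul, 1), "ul at", round(naoh_final, 2), "N NaOH")
```

The volume after hydrolysis, about 22.8 µl, is close to the 22.5 µl of 2× loading dye added in the next step, so the dye ends up at approximately 1×.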

Gel purify to separate extended reverse transcriptase products from the 
unextended primer. Prepare a 10% denaturing polyacrylamide gel. Add 
22.5 µl 2× denaturing loading dye to each reverse transcription reaction. In 
parallel, prepare samples with 1.0 µl 10 bp ladder and with 0.5 µl primer 
(50 µM), dilute to 11.3 µl with water, and add 11.3 µl 2× denaturing loading 
dye. Denature samples 1 min at 95 °C and return immediately to ice. Load 
samples on the gel, using two lanes for each reverse transcription reaction. Run 
the gel under conditions that optimize separation of the unextended primer 
from the footprint samples, which are all between 90 and 130 nt. When using a 
precast denaturing polyacrylamide mini-gel (Invitrogen EC6865BOX), 
65 min at 200 V will resolve reverse transcription products. Stain the gel 
3 min in SYBR Gold and visualize by UV transillumination. Excise the 
extended reverse transcriptase product band (Fig. 6.5), taking care to avoid 
the unextended primer. Recover DNA from the gel slice (see Section 5.4). 



3.3. Circularization 

Resuspend the gel extraction products in 15.0 µl 10 mM Tris-Cl, pH 8. 
Add 2.0 µl 10× CircLigase buffer, 1.0 µl 1 mM ATP, 1.0 µl 50 mM MnCl2, 
mix well, and add 1.0 µl CircLigase (Epicentre Biotechnologies CL4111K). 
Incubate 1 h at 60 °C, then heat inactivate at 80 °C for 10 min. It is not 
necessary to further purify circularized first-strand cDNA before using it as 
a PCR template. 

Figure 6.4 Kinetics of polyadenylation reaction. (A) Distribution of oligonucleotide 
sizes from a time-course of polyadenylation. Polyadenylation reactions were prepared as 
described, using 8 pmol control RNA oligo per reaction as a substrate. Reactions were 
carried out at 37 °C for the indicated time, quenched, precipitated, and analyzed using 
the BioAnalyzer Small RNA assay. (B) Quantification of average product length from 
(A). The population of polyadenylated products was selected based on a length 
threshold that excluded 95% of RNA in the unadenylated sample. The average length of the 
polyadenylated products is plotted against the reaction time along with a linear fit 
(33.8 nt + 3.5 nt/min). (C) Quantification of average product length as a function of 
substrate concentration. As in (B), for reactions with different substrate concentrations 
in an 8 min reaction. The average length of polyadenylation is calculated by subtracting 
the actual length of the substrate molecule, and the total amount of polyadenylation is 
calculated by multiplying the average length per molecule by the quantity of substrate in 
the reaction. The rate, computed from the total amount of polyadenylation, is plotted 
on a double reciprocal plot against the concentration of substrate, along with a linear fit 
(K_M = 1500 nM, v_max = 180 pmol/min). Note that while total adenylation displays 
hyperbolic kinetics, tail length is given by adenylation per substrate molecule and is thus 
independent of substrate concentration in the low-concentration regime and inversely 
proportional to substrate concentration in the high-concentration regime. 
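
The relationship described in the caption can be made concrete with a short calculation. The sketch below takes K_M and v_max from the fit in panel (C) and divides the Michaelis-Menten rate of total adenylation by the amount of substrate to get tail growth per molecule. The 20 µl reaction volume is an assumption carried over from Section 3.1, so the absolute values are illustrative and need not reproduce the time-course fit in panel (B).

```python
K_M_NM = 1500.0          # from the linear fit in the caption
V_MAX_PMOL_MIN = 180.0   # from the linear fit in the caption
REACTION_UL = 20.0       # assumption: 9.0 + 9.0 + 2.0 ul as in Section 3.1

def total_rate_pmol_min(substrate_nM):
    """Michaelis-Menten rate of total adenylate incorporation."""
    return V_MAX_PMOL_MIN * substrate_nM / (K_M_NM + substrate_nM)

def tail_growth_nt_per_min(substrate_nM, volume_ul=REACTION_UL):
    """Adenylates added per substrate molecule per minute."""
    substrate_pmol = substrate_nM * volume_ul / 1000.0
    return total_rate_pmol_min(substrate_nM) / substrate_pmol

for s in (50, 500, 5000, 50000):
    print(s, "nM:", round(tail_growth_nt_per_min(s), 2), "nt/min per molecule")
```

Well below K_M the per-molecule rate approaches a constant, while well above K_M a tenfold increase in substrate reduces it almost tenfold, which is the behavior the caption describes.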



3.4. PCR amplification 

Prepare PCR mixes for five reactions, each of 16.7 µl, for each circularized 
template. Use 16.7 µl 5× Phusion HF buffer, 1.7 µl dNTPs 10 mM each, 
0.8 µl library primers 50 µM each, 58.4 µl water, and 5.0 µl circularized 






Nicholas T. Ingolia 



Figure 6.5 Gel purification of reverse transcriptase products. (A) A denaturing 10% 
polyacrylamide gel was loaded with denatured 10 bp DNA ladder and with the reverse 
transcriptase primer as well as six lanes containing reverse transcription reactions. The 
unextended primer band is visible in all reactions at around 100 nt, and the reverse 
transcription products produce a discrete band at 125-130 nt. (B) The gel depicted in 
(A), after the reverse transcription products were excised. 



DNA, followed by 0.8 µl Phusion polymerase. Set up four PCR tube strips 
and make one 16.7 µl aliquot of PCR mix into each tube strip. Perform 
PCR amplification as follows: 30 s at 98 °C; 12 cycles of 10 s at 98 °C, 10 s 
at 60 °C, and 5 s at 72 °C. Remove one strip tube at the end of the 6th, 8th, 
and 10th amplification cycle, leaving the last strip tube in the thermal cycler 
for all 12 cycles. Add 3.4 µl 6× gel loading dye to each reaction. 
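
The PCR mix described in this section can be tallied to confirm the per-reaction volume and the final reagent concentrations. The sketch below assumes that the library primers are added as a single 0.8 µl volume of a stock containing each primer at 50 µM, which the text does not state explicitly; the dictionary keys and function names are illustrative.

```python
# Component volumes (ul) for the five-reaction mix given in the text
mix_ul = {
    "5x Phusion HF buffer": 16.7,
    "dNTPs (10 mM each)": 1.7,
    "library primers (50 uM each)": 0.8,   # assumed to be one combined stock
    "water": 58.4,
    "circularized DNA": 5.0,
    "Phusion polymerase": 0.8,
}

def total_volume(mix):
    """Total volume of the mix in ul."""
    return sum(mix.values())

def final_conc(stock_conc, component_ul, total_ul):
    """Final concentration of a component after dilution into the mix."""
    return stock_conc * component_ul / total_ul

total = total_volume(mix_ul)
print("total:", round(total, 1), "ul; per reaction:", round(total / 5, 1), "ul")
print("buffer:", round(final_conc(5, 16.7, total), 2), "x")
print("dNTPs:", round(final_conc(10, 1.7, total), 2), "mM each")
print("primers:", round(final_conc(50, 0.8, total), 2), "uM each")
```

Because the mix is sized for five reactions while four aliquots are taken, a small excess remains, presumably to allow for pipetting losses.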

Prepare an 8% nondenaturing polyacrylamide gel in 1× TBE. Prepare a 
sample of 1.0 µl 10 bp ladder with 15.7 µl water and 3.4 µl 6× loading dye. 
Load PCR samples on the gel and run the gel under conditions that 
optimize separation between 90 and 120 bp fragments. When using a 
precast polyacrylamide mini-gel (Invitrogen EC6215BOX), 40 min at 
180 V separates the sequencing library band from other products. Stain 
the gel for 3 min in SYBR Gold and then visualize by UV transillumination. 
A product band should be visible at roughly 120 bp and it should be more 
intense in samples subjected to more rounds of amplification (Fig. 6.6). 
There may also be a product band at 90 bp, corresponding to unextended 
reverse transcription primer. Finally, samples subjected to more rounds of 
PCR amplification may show additional products much larger than 120 bp. 
Excise the 120 bp product band from one or two reactions for each sample. 
Select reactions where amplification has not reached saturation, as judged by 
increasing intensity of the product band, and where higher molecular 
weight products are not prominent. Recover DNA from the excised gel 
slice (see Section 5.4). 

Figure 6.6 Gel purification of PCR-amplified sequencing libraries. (A) A nondenaturing 
8% polyacrylamide gel was loaded with sets of PCR reactions from two input 
templates along with a 10-bp DNA ladder. Each template was amplified in four parallel 
reactions with increasing numbers of amplification cycles. The prominent 120 nt 
product is the sequencing library, while the faint 90 nt band represents circularized 
but unextended reverse transcriptase primer. (B) The gel depicted in (A), after the 
sequencing library product bands were excised. 



Resuspend the gel-purified sequencing library in 20.0 µl 10 mM Tris-Cl, 
pH 8. Quantify DNA in the library using the Agilent BioAnalyzer DNA 
1000 assay. Typically, the recovered DNA will have a concentration of 
5-25 nM and there will be no significant peak at 90 bp. This gel-purified 
library is suitable for sequencing on the Illumina Genome Analyzer. 




4. Data Analysis 



Deep sequencing of ribosome footprints provides a rich data set for 
studying translation, particularly when it is accompanied by mRNA abun- 
dance measurements. The first step in analyzing footprint sequence data is 
to map the sequencing reads against a reference sequence. Mapping 
sequencing reads may reveal unanticipated regions of translation, such as 
uORFs, as well as translation of annotated protein-coding genes. Once the 
reads have been mapped, expression can be quantified by calculating the 
density of sequencing reads in a specific gene. The significance of variation 
between read density in different experimental conditions can be estimated 
empirically with biological replicates. 

Figure 6.7 Ambiguities in alignments of sequencing reads due to polyadenylation. 
A sample genomic sequence from the start of the GCN4 gene is shown along with five 
possible footprinting products that differ only in the position of their 3' terminus. 
Polyadenylation adds poly-(A) tails, shown in gray, but sequencing cannot distinguish 
which nucleotides occurred in the original RNA fragment and which were added. 
The middle three sequencing reads are all identical, whereas the 3' termini of the top 
and bottom read can be uniquely distinguished. A table of inferred reference alignment 
lengths l_min and l_max is given for each sequencing read. 



4.1. Mapping polyadenylated sequences 

Polyadenylation of footprint fragments introduces difficulties in mapping a 
sequence to its genomic origin. Ribosome footprint fragments are shorter 
than the 36 nt sequencing read length provided by the Genome Analyzer, so 
some poly-A sequence will be present at the end of most sequencing reads. 
It is not always possible to know how many terminal As in a sequencing read 
were added during polyadenylation and how many were derived from the 
RNA fragment (Fig. 6.7). Furthermore, sequencing errors are more likely 
near the end of the read, making it more difficult to identify the exact extent 
of the terminal poly-A region. One approach that avoids these difficulties is 
to align a "seed" from the beginning of each sequencing read to the 
reference sequence and then to determine the full extent of the alignment. 
The Bowtie short sequence alignment program is fast, robust, and 
well-suited to performing these seed alignments (Langmead et al., 2009). 
Furthermore, it reports alignments to targets in the order in which they 
appear in the query file, which facilitates postprocessing of seed alignments. 
Finally, it is capable of reporting multiple alignments as well as reporting 
on degenerate and unaligned query sequences. 

Begin by processing the FastQ format file of sequencing reads to generate 
a FastQ format file of seed sequences comprising the first 23 nt of the full 
read. Use Bowtie to align these seed sequences against a reference database 
consisting of the full yeast genomic sequence. Use the "-v 2" command-line 
option to allow up to two mismatches in the seed alignment. Use "-m 16" 
to suppress reporting of seeds with more than 16 matches, which are 
typically degenerate poly-A sequencing reads that can result from unextended 
reverse transcription primer, and are uninformative in any case. 
Use "--unfa" and "--maxfa" to generate files containing the unaligned and 
degenerate seed sequences. 
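
The seed-extraction step can be sketched in a few lines of Python. This is a minimal illustration rather than the author's script: it assumes simple four-line FastQ records, and the function name `write_seeds` and the example read are hypothetical.

```python
import io

def write_seeds(fastq_in, fastq_out, seed_len=23):
    """Copy FastQ records, truncating sequence and quality to seed_len.

    Assumes each record occupies exactly four lines (no wrapped sequences).
    Returns the number of records written.
    """
    n = 0
    while True:
        header = fastq_in.readline()
        if not header:
            break
        seq = fastq_in.readline().rstrip("\n")
        plus = fastq_in.readline().rstrip("\n")
        qual = fastq_in.readline().rstrip("\n")
        fastq_out.write(header.rstrip("\n") + "\n")
        fastq_out.write(seq[:seed_len] + "\n")
        fastq_out.write(plus + "\n")
        fastq_out.write(qual[:seed_len] + "\n")
        n += 1
    return n

# Hypothetical 36-nt read: 30 nt of made-up sequence followed by six As.
read = "ATGTCCGAATATCAGCCAAG" + "TTTATTTGCA" + "A" * 6
source = io.StringIO("@read1\n" + read + "\n+\n" + "I" * 36 + "\n")
seeds = io.StringIO()
n_records = write_seeds(source, seeds)
seed_lines = seeds.getvalue().splitlines()
```

Keeping the read names unchanged in the seed file makes it straightforward to match each seed alignment back to its full-length read later.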

Process the seed alignments by iterating through the input and alignment 
files in parallel. When an input sequence is absent from the alignment, skip it 
and proceed to the next input sequence. When one or more seed alignments 
are present for an input sequence, extend them to find the full reference 
alignments. Use the coordinates of the seed alignment to extract the reference 
region corresponding to the full sequencing read. For each possible 
footprint length l, find the alignment score s(l) by adding the number of 
mismatches between the sequencing read and the genomic alignment over 
nucleotides 1 through l and the number of mismatches between the 
sequencing read and the poly-A linker sequence over nucleotides l + 1 
through L, the full length of the sequencing read. Determine the best 
alignment score, s*, corresponding to the fewest mismatches, and find the 
set of fragment lengths ℒ* that give this optimal alignment score. Finally, 
find the minimum and maximum alignment lengths in ℒ*, l_min and l_max. When the 
Bowtie results indicate multiple seed alignments, select the best extended 
reference alignments. In many cases, there will still be multiple, equally 
high-scoring extended alignments. Next, process the file of unaligned seeds 
to extract the full-length unaligned sequencing reads. 
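
The extension step for a single seed alignment can be sketched as follows. This is an illustrative implementation under stated assumptions: the reference string is assumed to start at the seed alignment position, on the same strand as the read, and to be at least as long as the read. The function name and the toy sequences are hypothetical.

```python
def extend_alignment(read, reference, min_len=1):
    """Score every candidate footprint length l for one seed alignment.

    read      -- full sequencing read of length L
    reference -- reference sequence beginning at the seed alignment start
                 (assumed to be at least L nucleotides long)
    Returns (s_star, l_min, l_max): the fewest total mismatches and the
    shortest and longest footprint lengths that achieve it.
    """
    L = len(read)
    # genome_mm[l] = mismatches against the reference over read positions 1..l
    genome_mm = [0] * (L + 1)
    for i in range(L):
        genome_mm[i + 1] = genome_mm[i] + (read[i] != reference[i])
    # tail_mm[l] = mismatches against poly-A over read positions l+1..L
    tail_mm = [0] * (L + 1)
    for i in range(L - 1, -1, -1):
        tail_mm[i] = tail_mm[i + 1] + (read[i] != "A")
    scores = {l: genome_mm[l] + tail_mm[l] for l in range(min_len, L + 1)}
    s_star = min(scores.values())
    best_lengths = [l for l, s in scores.items() if s == s_star]
    return s_star, min(best_lengths), max(best_lengths)

# Toy example: the genome continues with two As after the fragment core,
# so the fragment could have ended after 8, 9, or 10 nucleotides.
result = extend_alignment("ACGTACGTAAAAAA", "ACGTACGTAAGGCC")
print(result)
```

When Bowtie reports several seed alignments for one read, the same function can be applied to each candidate region and the candidates with the lowest s* retained.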

4.2. Reference databases 

A significant fraction of sequencing reads are derived from digested rRNA 
present in the monosome sample. First, align sequences to a database 
consisting of just the processed rRNA transcripts (RDN25-1, RDN18-1, 
RDN58-1, and RDN5-1) and proceed with further alignments using only 
the reads that do not align in the initial rRNA alignment. 

Alignments can be performed against the full yeast genomic sequence or 
against a collection of yeast protein-coding genes. The choice of reference 
database depends on the purpose of the ribosome profiling experiment. 
Protein-coding sequences may be the most appropriate reference when the 
principal goal of an experiment is gene expression measurements. They will 
include the spliced form of each transcript, so it is not necessary to account 
further for sequencing reads that overlap splice junctions. In preparing a 
protein coding reference database, include 15 nt on each end of the coding 
sequence itself, as the ribosome footprint extends roughly 15 nt to either 
side of the active codon. Alignments against a full genomic reference 
sequence are needed to detect novel sites of translation, including uORFs. 
Footprint sequences that overlap splice junctions substantially will not 
correspond to any chromosomal sequence in a genomic database. In bud- 
ding yeast, where splicing is limited and well-characterized, it is possible to 
correct for the effect of splice junctions on measurements of ribosome 
occupancy. Another option is to augment the genomic reference with a 
collection of splice junction sequences. 
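
Building one entry of a protein-coding reference database might look like the sketch below. It is an assumption-laden illustration: exon coordinates are taken to be 0-based, half-open intervals in ascending chromosomal order, the 15-nt flanks are taken directly from the adjacent genomic sequence, and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def flanked_cds(chrom_seq, exons, strand, flank=15):
    """Return the spliced coding sequence plus `flank` nt on each side.

    chrom_seq -- chromosome sequence (forward strand)
    exons     -- list of (start, end) 0-based half-open intervals,
                 sorted by chromosomal position
    strand    -- "+" or "-"
    """
    first_start = exons[0][0]
    last_end = exons[-1][1]
    if first_start < flank or last_end + flank > len(chrom_seq):
        raise ValueError("not enough flanking sequence on the chromosome")
    upstream = chrom_seq[first_start - flank:first_start]
    spliced = "".join(chrom_seq[start:end] for start, end in exons)
    downstream = chrom_seq[last_end:last_end + flank]
    seq = upstream + spliced + downstream
    return reverse_complement(seq) if strand == "-" else seq

# Toy chromosome: 15 nt, a two-exon gene with a 4-nt "intron", 15 nt.
chrom = "T" * 15 + "ATGAAA" + "GGGG" + "CCCTAA" + "C" * 15
entry = flanked_cds(chrom, [(15, 21), (25, 31)], "+")
```

Because the introns are removed when the entry is built, reads spanning a splice junction align to this reference without any further correction.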

4.3. Selecting high-quality alignments 

Many sequencing reads contain errors, and in some cases these errors 
compromise the mapping of the RNA fragment to the reference sequence. 
In other cases, the fragment is atypically short, presumably because of 
degradation during sample preparation. Noncoding RNA fragments also 
have a length distribution different than that seen for ribosomal footprints. 
Yeast ribosome footprints are typically between 27 and 30 nt, and sequenc- 
ing reads that could not possibly have been derived from an appropriately 
sized fragment should be excluded. However, the ambiguity in fragment 
length introduced by polyadenylation makes it impossible to determine the 
exact length of some fragments. Select sequencing reads that have: (1) two 
or fewer mismatches total in the alignment; (2) a maximum reference 
alignment length of at least 27 nt (l_max ≥ 27), excluding alignments that 
are demonstrably shorter than the minimum footprint length; and 
(3) similarly, a minimum reference alignment length of no more than 
30 nt (l_min ≤ 30). 
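
The three selection criteria translate directly into a filter. The sketch below is illustrative; the function and parameter names are hypothetical, and the defaults are the thresholds given in the text.

```python
def passes_filters(mismatches, l_min, l_max,
                   max_mismatches=2, min_footprint=27, max_footprint=30):
    """Return True if an extended alignment meets all three criteria.

    mismatches   -- total mismatches in the extended alignment
    l_min, l_max -- shortest and longest reference alignment lengths
                    consistent with the read
    """
    return (mismatches <= max_mismatches   # criterion 1
            and l_max >= min_footprint     # criterion 2: not demonstrably too short
            and l_min <= max_footprint)    # criterion 3: not demonstrably too long

print(passes_filters(0, 26, 28))  # ambiguous length that could be 27 or 28 nt
```

A read whose possible lengths span 26 to 28 nt is kept, because it could have come from a 27- or 28-nt fragment; only reads that cannot be reconciled with a 27 to 30 nt footprint are discarded.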

4.4. Quantifying gene expression 

The number of ribosome footprint fragments derived from a gene provides 
a measurement of that gene's expression level, but two normalization factors 
are required to make quantitative comparisons. Ribosomes are believed to 
translate proteins at a similar rate, but this means that a ribosome will spend 
more time translating a longer gene than a shorter one. Because of this 
effect, a longer gene will produce more footprint fragments than a shorter 
gene expressed at the same level. Thus, the read density on a protein-coding 
gene, expressed in reads per kilobase, provides a directly comparable measurement 
of expression. Divide the number of sequencing reads for a gene 
by the size of the gene; if chromosomal alignments are used, correct for any 
splice junctions that would result in a ~28 nt window where reads would 
not align to the genomic reference sequence. Although 30 additional 
nucleotides are added to coding sequences when constructing a reference 
database, do not correct the length for these added nucleotides, as the 
number of possible 30 nt fragments in the extended sequence is the same 
as the length of the actual gene. 

Samples will differ in the total number of sequencing reads available and 
the fraction of true footprint fragments, as opposed to rRNA contamination. 
To account for this difference, divide the read density by the total 
number of true footprint fragments (sequencing reads that align to the 
reference genome with the appropriate size), expressed in millions, to determine 
reads per kilobase per million (rpkM) (Mortazavi et al., 2008). 
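
As a worked example of the two normalizations, the calculation can be written as a single function. This is a sketch with hypothetical names; the gene length is the coding sequence length without the added flanking nucleotides, as the text recommends.

```python
def rpkm(read_count, gene_length_nt, total_footprints):
    """Reads per kilobase of gene per million true footprint reads.

    read_count       -- footprint reads assigned to the gene
    gene_length_nt   -- coding sequence length in nucleotides
    total_footprints -- total footprint reads in the sample that passed
                        alignment and size filtering
    """
    reads_per_kb = read_count / (gene_length_nt / 1e3)
    return reads_per_kb / (total_footprints / 1e6)

# A 1 kb gene with 500 reads and a 2 kb gene with 1000 reads have the
# same read density in a sample of 2 million footprint reads.
short_gene = rpkm(500, 1000, 2_000_000)
long_gene = rpkm(1000, 2000, 2_000_000)
print(short_gene, long_gene)
```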

It is particularly important to properly characterize expression measure- 
ment errors and determine the statistical significance of observed changes 
when performing genome-wide measurements. Deep sequencing expres- 
sion measurements are derived from a count of discrete sequencing reads, 
which introduces errors that follow a well-understood statistical model. 
When the absolute number of sequencing reads contributing to a measure- 
ment is small, statistical sampling error will contribute a large variance 
to observed expression ratios between two samples. Use a statistical test 
such as the chi-square test to determine whether the observed distribution 
of sequencing reads between two samples differs significantly from an 
expectation derived from the median ratio of reads for well-expressed 
genes. The statistical variation in expression measurements becomes negli- 
gible when the total number of sequencing reads is large; 128 reads is a good 
threshold for many analyses. Biological variability in samples or in library 
generation predominates under these circumstances, and the coefficient of 
variation remains constant with increasing numbers of sequencing reads. 
Biological replicates prepared independently from matched samples provide 
the best direct assessment of variability. Determine the distribution of log 
ratios of sequencing read counts between replicates for a set of genes 
with similar expression levels. This distribution can be used to assess the 
likelihood of a measured difference between biologically distinct samples. 
Typically, the distribution is roughly normal, though as with many 
biological measurements the tails may contain more highly variable genes 
than expected. Regardless, it can be used to estimate the fraction of genes 
whose fragment count ratio would exceed a given threshold by chance. 
The number of genes whose ratio exceeds the threshold between biological 
replicates, divided by the number exceeding it in a biological experiment, 
estimates the false discovery rate at that threshold. 
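
The statistical steps in this paragraph can be sketched with the Python standard library alone. The code below is an illustration under stated assumptions: the chi-square test has one degree of freedom, "well-expressed" is taken to mean at least 128 reads in both samples, and the false discovery rate is estimated as the replicate count divided by the experiment count at a given absolute log-ratio threshold. All function names are hypothetical.

```python
import math
from statistics import median

def median_ratio(counts_a, counts_b, min_reads=128):
    """Median ratio of read counts over genes well expressed in both samples."""
    ratios = [a / b for a, b in zip(counts_a, counts_b)
              if a >= min_reads and b >= min_reads]
    return median(ratios)

def chi_square_two_sample(count_a, count_b, expected_ratio):
    """Chi-square test (1 df) of one gene's reads split between two samples.

    expected_ratio is the expected count_a / count_b, for example the
    median ratio over well-expressed genes. Returns (chi2, p_value).
    """
    total = count_a + count_b
    expected_a = total * expected_ratio / (1.0 + expected_ratio)
    expected_b = total - expected_a
    chi2 = ((count_a - expected_a) ** 2 / expected_a
            + (count_b - expected_b) ** 2 / expected_b)
    # Upper tail of a chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

def estimated_fdr(experiment_log_ratios, replicate_log_ratios, threshold):
    """Genes beyond the threshold between replicates divided by genes beyond
    it in the experiment; assumes both lists cover the same set of genes."""
    n_exp = sum(abs(r) > threshold for r in experiment_log_ratios)
    n_rep = sum(abs(r) > threshold for r in replicate_log_ratios)
    return n_rep / n_exp if n_exp else float("nan")

chi2, p = chi_square_two_sample(150, 50, expected_ratio=1.0)
print(f"chi2 = {chi2:.1f}, p = {p:.2e}")
```

The chi-square test addresses only counting error, so it is most informative for genes with few reads; for well-expressed genes the replicate-based estimate reflects the biological variability that dominates.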

Even in the relatively compact yeast genome, there are duplicated and 
degenerate regions where ribosome footprint sequences cannot be uniquely 
mapped to a genomic origin. The simplest approach to deal with this 
problem is to exclude genes with significant nonunique regions, which 
would affect a relatively small fraction of the yeast genome. A more directed 
approach would be to exclude only the degenerate regions, ignoring 
sequencing reads that lie in these areas and correcting the length of the 
coding sequence for the suppressed region. Finally, it would be possible to 
quantify read density in degenerate regions and then partition the read 
density between different chromosomal loci based on the read density in 
adjoining unique regions (Mortazavi et al., 2008). All of these approaches 
would fail in the case of perfect or near-perfect paralogues such as the 
duplicate histone genes in the yeast genome. These duplicate genes should 
be handled separately, by making a single, combined measurement of 
their expression and, if necessary, specifically quantifying the presence 
of signature variations that distinguish between them. 




5. Solutions and Common Procedures 

5.1. Solutions 

Alkaline fragmentation solution (2x): 2 mM EDTA, 100 mM Na-CO3, 
pH 9.2. This solution is prepared by mixing 15 parts 0.1 M Na2CO3 to 
110 parts 0.1 M NaHCO3. It will equilibrate with gaseous CO2 to raise 
the pH over time and thus should be stored in tightly capped, single-use 
aliquots at room temperature. 

Alkaline fragmentation stop/precipitation (540 µl/600 µl): 60 µl 3 M NaOAc 
(pH 5.5), 2.0 µl GlycoBlue 15 mg/ml (Ambion AM9515), 500 µl 
RNase-free water. 

DNA gel extraction buffer: 300 mM NaCl, 10 mM Tris-Cl (pH 8), 1 mM 
EDTA. 

Polyadenylation enzyme mix: 5.0 µl 2x polyadenylation buffer with ATP, 
4.0 µl RNase-free water, 1.0 µl E. coli poly-(A) polymerase 5 U/µl 
(New England Biolabs M0276S). The final concentration of enzyme is 
1 U/2 µl. 

Polyadenylation buffer with ATP (2x): 2x poly-(A) polymerase buffer, 2 mM 
ATP, 1 U/µl SUPERase-In. Prepare, for example, 5.0 µl 10x poly-(A) 
polymerase buffer, 5.0 µl 10 mM ATP, 1.25 µl SUPERase-In, and 
13.8 µl RNase-free water. 

Polysome gradient buffer: 20 mM Tris-Cl (pH 8), 140 mM KCl, 5.0 mM 
MgCl2, 100 µg/ml cycloheximide, 0.5 mM DTT, 20 U/ml 
SUPERase-In. 

Polysome lysis buffer: 20 mM Tris-Cl (pH 8), 140 mM KCl, 1.5 mM MgCl2, 
100 µg/ml cycloheximide, 1% (v/v) Triton X-100. 

RNA gel extraction buffer: 300 mM NaOAc (pH 5.5), 1 mM EDTA, 0.1 U/µl 
SUPERase-In. 

Total RNA extraction buffer: 50 mM NaOAc (pH 5.5), 10 mM EDTA. 



5.2. Oligonucleotides 

Reverse transcription primer: 
5'-/5Phos/GATCGTCGGACTGTAGAACTCTGAACCTGTCGGTGGTCGCCGTATCATT/iSp18/CACTCA/iSp18/CAAGCAGAAGACGGCATACGATTTTTTTTTTTTTTTTTTTTVN 

This primer is 5' phosphorylated, to allow circularization, and contains a 
binding site for the Illumina small RNA sequencing primer as well as one 
Illumina library primer. The flexible linker consists of a 6-nt spacer 
sequence flanked by two Spacer 18 linkers, which should provide a complete 
block to polymerases. Following the linker is the second library primer 
sequence and a (dT)20 primer anchored to the end of the poly-(A) site by 
degenerate V (A, C, or G) and N (A, C, G, or T) bases. 

Library primers: 
5'-AATGATACGGCGACCACCGA 
5'-CAAGCAGAAGACGGCATACGA 

These correspond to the A and B primers present on the Illumina single-read 
flowcell. 

Sequencing primer: 
5'-CGACAGGTTCAGAGTTCTACAGTCCGACGATC 

This corresponds to the Illumina small RNA sequencing primer. 

Control oligonucleotide (RNA): 
5'-AUGUACACGGAGUCGACCCGCAACGCGA 

This 28 nt RNA oligonucleotide is used as a standard for size selection 
during library generation. It is an arbitrary sequence which does not resem- 
ble any sequence in the yeast genome. 
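
The relationships among the oligonucleotides listed in this section can be verified computationally. The sketch below is illustrative; the variable names are hypothetical, the reverse transcription primer is split at its two Spacer 18 linkers, and the (dT)20 stretch is written as twenty Ts as described in the text.

```python
def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

# Segments of the reverse transcription primer, split at the Spacer 18 linkers.
rt_5prime = "GATCGTCGGACTGTAGAACTCTGAACCTGTCGGTGGTCGCCGTATCATT"
rt_spacer = "CACTCA"
rt_3prime = "CAAGCAGAAGACGGCATACGA" + "T" * 20 + "VN"

library_primer_a = "AATGATACGGCGACCACCGA"
library_primer_b = "CAAGCAGAAGACGGCATACGA"
sequencing_primer = "CGACAGGTTCAGAGTTCTACAGTCCGACGATC"
control_rna = "AUGUACACGGAGUCGACCCGCAACGCGA"

# The 5' segment begins with the complement of the sequencing primer and
# ends with the complement of the first library primer.
seq_site_ok = rt_5prime.startswith(reverse_complement(sequencing_primer))
lib_a_site_ok = rt_5prime.endswith(reverse_complement(library_primer_a))
# The 3' segment begins with the second library primer sequence itself.
lib_b_ok = rt_3prime.startswith(library_primer_b)

print(seq_site_ok, lib_a_site_ok, lib_b_ok, len(rt_spacer), len(control_rna))
```

A check of this kind is a cheap safeguard against transcription errors when ordering long modified oligonucleotides.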



5.3. Nucleic acid precipitation 

Perform precipitation of nucleic acids in nonstick tubes, which promote the 
formation of a tight pellet at the bottom of the tube. Add NaOAc (pH 5.5), 
for RNA, or NaCl, for DNA, to raise the concentration of Na + to 300 mM. 
Where noted, the protocol includes steps that achieve this salt concentration 
and no further salt is needed. Add a coprecipitant such as 2.0 µl GlycoBlue 
whenever precipitating small quantities of nucleic acids. GlycoBlue in 
particular creates visible pellets but interferes with spectrophotometric 
quantitation. Add at least 1 volume isopropanol and incubate at least 30 min 
at -20 °C. Pellet the sample by spinning 30 min at full speed in a microfuge 
at 4 °C. Remove as much of the supernatant as possible. Pulse spin the tube 
to collect the residual liquid at the bottom of the tube, then use a very thin 
pipette tip such as a 10-µl tip or a gel loading tip to remove as much liquid as 
possible. Air dry the pellet by leaving the tube open on its side for 5-10 min. 

5.4. Nucleic acid gel extraction 

Pierce a 0.5-ml microfuge tube with a 20-gauge needle. Nest it inside a 
1.5-ml nonstick microfuge tube and label the side of the tube, as the lid may 
snap off during centrifugation. Place the gel slice in the inner tube and spin 
the tubes 2 min at full speed in a tabletop microfuge to extrude the gel 
through the needle-hole into the outer tube. Remove the inner tube, invert 
it over the outer tube, and tap it to collect any remaining gel debris. Add 
400 µl RNA or DNA gel extraction buffer to the gel and incubate overnight 
at 4 °C with gentle agitation. Cut the tip off of a 1000-µl pipette 
tip and collect the buffer and gel debris. Load it onto a Spin-X cellulose 
acetate filter column (Corning 8162) and centrifuge 1 min at full speed in a 
microfuge. Collect the flow-through and precipitate the recovered nucleic 
acid (see Section 5.3). Note that both gel extraction buffers already contain 
300 mM salt, but add a coprecipitant, as low amounts of nucleic acid are 
recovered in the gel extractions. 

Abstract 

A genetic interaction occurs when the combination of two mutations leads to an 
unexpected phenotype. Screens for synthetic genetic interactions have been used 
extensively to identify genes whose products are functionally related. In particular, 
synthetic lethal genetic interactions often identify genes that buffer one another or 
impinge on the same essential pathway. For the yeast Saccharomyces cerevisiae, 
we developed a method termed synthetic genetic array (SGA) analysis, which 
offers an efficient approach for the systematic construction of double mutants 
and enables a global analysis of synthetic genetic interactions. In a typical SGA 
screen, a query mutation is crossed to an ordered array of ~5000 viable gene 
deletion mutants (representing ~80% of all yeast genes) such that meiotic progeny 
harboring both mutations can be scored for fitness defects. This approach can 
be extended to all ~6000 genes through the use of yeast arrays containing 
mutants carrying conditional or hypomorphic alleles of essential genes. Estimating 
the fitness for the two single mutants and their corresponding double mutant 
enables a quantitative measurement of genetic interactions, distinguishing nega- 
tive (synthetic lethal) and positive (within pathway and suppression) interactions. 
The profile of genetic interactions represents a rich phenotypic signature for each 
gene, and clustering genetic interaction profiles groups genes into functionally 
relevant pathways and complexes. This array-based approach automates yeast 
genetic analysis in general and can be easily adapted for a number of different 
genetic screens or combined with high-content screening systems to quantify the 
activity of specific reporters in genome-wide sets of single or more complex 
multiple mutant backgrounds. Comparison of genetic and chemical-genetic inter- 
action profiles offers the potential to link bioactive compounds to their targets. 
Finally, we also developed an SGA system for the fission yeast Schizosaccharomyces 
pombe, providing another model system for comparative analysis of genetic 
networks and testing the conservation of genetic networks over millions of years of 
evolution. 




1. Introduction 

A genetic interaction refers to an unexpected phenotype not easily 
explained by combining the effects of individual genetic variants (Bateson 
et al., 1905). Importantly, genetic interactions organize into complex networks 
that may underlie the relationship between an organism's genotype 
and its phenotype (Waddington, 1957). Thus, an unbiased, systematic 
analysis of these networks and the interactions that comprise them is 
required in order to understand the genetic basis underlying disease. 
Given the complexity of the human genome (Levy et al., 2007), determining 
how different alleles and polymorphisms combine to manifest a phenotype 
is a daunting task. Researchers have, therefore, embraced inbred model 
systems as well as isogenic populations of cultured cells derived from fruit 
flies and mammals, as platforms to map genetic interactions in a systematic, 
unbiased, and comprehensive fashion (Dixon et al., 2009). 

Large-scale genetic interaction mapping studies were pioneered in 
Saccharomyces cerevisiae and focused on the identification of a specific type 
of interaction termed synthetic lethality (Tong et al., 2001, 2004). Synthetic 
lethal or sick interactions, in which a combination of mutations in two 
genes results in cell death or reduced fitness, respectively, have been used 
extensively in different model organisms to identify genes whose products 
buffer one another and impinge on the same essential biological process (Fay 
et al., 2002; Hartman et al., 2001; Lucchesi, 1968). Genome-wide synthetic 
lethal analysis in yeast revealed the importance of systematic genetic interaction 
maps for assessing the biological roles of genes in vivo and uncovering 
new components of specific pathways (Pan et al., 2006; Tong et al., 2004). 
Recently, genetic mapping technologies combined with quantitative 
phenotypic analyses have enabled further dissection of genetic interactions 
into different types and classes, offering the potential to define protein 
complex membership and infer order of gene function within specific 
biochemical pathways (Collins et al., 2006; St Onge et al., 2007). 

The first enabling reagent set for large-scale genetic interaction screens 
in budding yeast was derived from the deletion mutant project, in which 
each known or suspected open reading frame is deleted and replaced with 
the dominant drug-resistance marker, kanMX (Giaever et al., 2002; 
Winzeler et al., 1999). The international consortium responsible for this 
landmark analysis identified ~1000 essential genes and constructed ~5000 
viable haploid deletion mutants. The introduction of molecular tags or 
barcodes, a unique 20-bp DNA sequence at either end of the deletion 
cassette, acts as a unique mutant strain identifier enabling the fitness of a 
particular mutant to be assessed within a population using a barcode microarray 
(Giaever et al., 1999). Additional libraries have subsequently been 
developed in which each of the ~1000 essential genes is altered in such a 
way as to produce either conditional alleles (Ben-Aroya et al., 2008; 
Mnaimneh et al., 2004) or hypomorphic (partially functional) alleles compatible 
with viability (Schuldiner et al., 2005). Combining the viable deletion 
mutant and essential gene mutant collections provides the first 
opportunity for systematic genetic analysis in yeast and the potential for 
examining the complete genome-wide set of ~18 million different double 
mutants for synthetic genetic interactions. 

Synthetic genetic array (SGA) analysis enables the systematic construction 
of double mutants (Tong et al., 2001), allowing large-scale mapping of 
synthetic genetic interactions (Costanzo et al., submitted for publication; 
Tong et al., 2004). A typical SGA screen involves crossing a "query" strain 
to the array of ~5000 viable deletion mutants, but the array may also 
include essential gene mutants, and through a series of replica-pinning 
procedures, the double mutants are selected and scored for growth. 
Applying SGA analysis on a genome-wide scale to ~1700 query mutations 
has enabled us to generate a genetic interaction network containing 
~170,000 interactions, with functional information associated with the 
position and connectivity of a gene on the network (Costanzo et al., 
submitted for publication). 

The SGA methodology is versatile because any genetic element (or any 
number of genetic elements) linked to a selectable marker(s) can be manipu- 
lated similarly. In this regard, SGA methodology automates yeast genetics 
generally, such that specific alleles of genes, including point mutants and 
temperature-sensitive alleles, or plasmids can be crossed into any ordered 
array of strains providing systematic approaches to genetic suppression 
analysis, dosage lethality, dosage suppression or plasmid or reporter shuf- 
fling. In this chapter, we describe the steps of SGA analysis in detail and 
we hope to encourage laboratories from a broad spectrum of fields to adopt 
this methodology to suit their specific interests. 




2. Methodology 

SGA analysis first requires a relatively simple set up, generating the 
query strains and the array strains, and then the procedure itself basically 
involves several replica-plating steps, which are amenable to either manual 
or robotic manipulation. Each step of the procedure is described in detail 
below. 

SGA query strain construction 

Pin tool sterilization procedures 

Constructing a 1536-density deletion mutant array (DMA) 

SGA procedure 

Double mutant array image acquisition and processing 

Quantitative scoring of genetic interactions using colony size-based fitness measurements 

Interpretation and analysis of genetic interactions 



2.1. SGA query strain construction 

As mentioned above, SGA enables high-throughput construction and iso- 
lation of haploid double mutants by mating a "query" strain of interest 
harboring SGA reporters to an ordered array of mutant strains. This section 
describes different approaches for constructing SGA query strains. 

2.1.1. Nonessential query strain: PCR-mediated gene deletion 

1. Synthesize two gene-deletion primers, each containing 55 bp of sequence 
at the 5' end that is specific to the region upstream or downstream of the 
gene of interest (GeneX), excluding the start and stop codons, and 22 bp of 
sequence at the 3' end that is specific for the amplification of the natMX4 
cassette (Goldstein and McCusker, 1999). The MX4 cassette amplification 
sequences include the forward amplification primer 
(5'-ACATGGAGGCCCAGAATACCCT-3') and the reverse amplification primer 
(5'-CAGTATAGCGACCAGCATTCAC-3'). 

2. Set up a 100 µl PCR reaction (68.7 µl H2O, 10 µl 10x PCR buffer, 2 µl 
10 mM dNTPs, 2 µl 50 µM forward primer, 2 µl 50 µM reverse primer, 
~0.1 µg p4339 DNA template in 10 µl, 5 µl DMSO, 0.3 µl 5 U/µl Taq 
polymerase) to amplify the natMX4 cassette flanked with 55-bp target 
sequences from p4339 (pCRII-TOPO::natMX4) with the gene-deletion 
primers designed in step 1. Plasmid p4339 serves as a DNA 
template to amplify the natMX4 cassette. 

3. Cycle as follows in a thermocycler with a heated lid: 95 °C for 5 min; 
30 cycles of 95 °C for 30 s, 55 °C for 30 s, and 68 °C for 2 min; 68 °C for 
10 min; hold at 4 °C. PCR products can be stored at -20 °C. 

4. Transform the PCR product into the SGA starting strain, Y7092 
(MATα can1Δ::STE2pr-Sp_his5 lyp1Δ ura3Δ0 leu2Δ0 his3Δ1 met15Δ0) 
using standard procedures (Winzeler et al., 1999). Y7092 harbors reporters 
and markers necessary for SGA haploid strain selection following 
meiotic recombination. In particular, the MATa-specific reporter 
[STE2pr-Sp_his5, composed of the Saccharomyces cerevisiae STE2 promoter 
driving the Schizosaccharomyces pombe his5 gene, which complements 
S. cerevisiae his3Δ1 (Tong and Boone, 2006)] was integrated at the 
CAN1 locus. Loss of CAN1 confers canavanine resistance. Y7092 also 
carries a lyp1 marker, which confers resistance to thialysine. The can1 and 
lyp1 mutations serve as recessive counterselectable markers designed 
to remove unwanted heterozygous diploids from the population. Select 
transformants on YEPD + clonNAT medium (see Section 3). 
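
The primer design in step 1 can be sketched as below. This is an illustration under explicit assumptions, not the authors' design tool: the homology arms are taken to be the 55 bp immediately upstream of the start codon and immediately downstream of the stop codon, both supplied on the coding strand, and the reverse primer is assumed to carry the reverse complement of the downstream arm. The function name and toy flanking sequences are hypothetical.

```python
# natMX4 cassette amplification sequences given in step 1.
NATMX4_FORWARD = "ACATGGAGGCCCAGAATACCCT"
NATMX4_REVERSE = "CAGTATAGCGACCAGCATTCAC"

def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def deletion_primers(upstream, downstream, homology=55):
    """Build forward and reverse gene-deletion primers.

    upstream   -- coding-strand sequence ending just before the start codon
    downstream -- coding-strand sequence starting just after the stop codon
    Each primer is `homology` nt of gene-specific sequence followed by the
    22-nt cassette amplification sequence.
    """
    if len(upstream) < homology or len(downstream) < homology:
        raise ValueError("flanking sequences must be at least 55 nt long")
    forward = upstream[-homology:] + NATMX4_FORWARD
    reverse = reverse_complement(downstream[:homology]) + NATMX4_REVERSE
    return forward, reverse

# Toy flanking sequences, chosen only to make the orientation easy to see.
fwd, rev = deletion_primers("A" * 10 + "C" * 55, "A" * 55 + "G" * 10)
print(len(fwd), len(rev))
```

Each primer is 77 nt long (55 nt of homology plus 22 nt of cassette sequence), so oligonucleotide quality should be considered when ordering.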



2.1.2. Nonessential query strain: Gene deletion marker 
switch method 

1. Obtain the deletion strain of interest (xxxΔ::kanMX4) from the MATa 
deletion collection (Open Biosystems), mate with Y8205 (MATα 
can1Δ::STE2pr-Sp_his5 lyp1Δ::STE3pr-LEU2 ura3Δ0 leu2Δ0 his3Δ1) 
and isolate diploid zygotes by micromanipulation. Transform the resulting 
diploid with EcoRI-digested p4339 (see above) using standard yeast 
transformation protocols (Winzeler et al., 1999). This procedure switches 
the gene deletion marker from kanMX4 to natMX4. Select transformants 
on YEPD + clonNAT medium (see Section 3). 

2. Transfer the resultant diploids to enriched sporulation medium (see 
Section 3) and incubate at 22 °C for 5 days. 

3. A MATα-specific reporter (STE3pr-LEU2) was integrated at the LYP1 
locus in Y8205 to provide a convenient method for selecting 
MATα meiotic progeny. Resuspend a small amount of spores in 
sterile water and grow on SD - Leu/Lys/Arg + canavanine/thialysine 
(see Section 3) to select MATα meiotic progeny. Incubate at 30 °C for 
~2 days. To facilitate the selection of MATα meiotic progeny we aim to 
plate ~200-300 colonies. 

4. Replica plate to YEPD + clonNAT (see Section 3) to identify the 
MATα meiotic progeny that carry the query deletion marked with 
natMX4 (xxxΔ::natMX4). 



2.2. Pin tool sterilization procedures 

The SGA procedure involves sequential transfer of yeast colonies onto 
different selective media in order to isolate haploid double mutant strains. 
Yeast colony transfer is accomplished manually using hand-held pin tools or 
automatically using robotically controlled pin tools. The following section 
describes pin tool sterilization procedures for both manual and robotic 
devices. 



2.2.1. Manual pin tools (for low throughput: ~1-2 genome-wide 
screens/month) 

The following manual pin tools can be purchased from V&P Scientific, Inc. 
(San Diego, CA): 96 floating pin E-clip style manual replicator (VP408FH), 
384 floating pin E-clip style manual replicator (VP384F), Registration 
accessories: Library Copier (VP381), Colony Copier (VP380), Pin 
cleaning accessories: plastic bleach or water reservoirs (VP421), pyrex 
alcohol reservoir with lid (VP420), pin cleaning brush (VP425). 

1. Set up the wash reservoirs as follows: three trays of sterile water of 
increasing volume — 30, 50, and 70 ml, one tray of 40 ml of 10% bleach, 
one tray of 90 ml of 95% ethanol. To ensure that pins are cleaned 
properly and to avoid contamination in the wash procedure, the volume 
of wash liquids in the cleaning reservoirs is calculated to cover the pins 
sequentially in small increments. For example, only the tips of the pins 
should be submerged in water in the first step. As the pins are transferred 
to subsequent cleaning reservoirs and the final ethanol step, the lower 
halves of the pins should be covered. 

2. Let the replicator sit in the 30-ml water reservoir for ~1 min to remove 
the cells on the pins. 

3. Place the replicator in 10% bleach for ~20 s. 

4. Transfer the replicator to the 50-ml water reservoir and then to the 
70-ml water reservoir to rinse the bleach off the pins. 

5. Transfer the replicator to 95% ethanol. 

6. Let excess ethanol drip off the pins, then flame. 

7. Allow replicator to cool. 

8. The manual replicator tools are compatible with OmniTray (Nunc) 
petri dishes. We found that ~35 ml of medium per OmniTray yields 
optimal results. 



2.2.2. Singer RoToR bench top robot (for high-throughput 
SGA analysis) 

The Singer RoToR can be purchased from Singer Instruments (Somerset, 
UK; www.singerinst.co.uk). This system uses disposable plastic replicators 
(RePads, Singer Instruments, UK) and thus does not require any steriliza- 
tion procedures, making it simple and rapid to use. The Singer RoToR 
HDA bench top robot also uses PlusPlate (Singer Instruments) petri dishes 
that have a larger surface area but the same external footprint dimensions as 
OmniTray (Nunc). These larger plates facilitate replica-plating with the 
disposable RePads. We found that ~50 ml of medium per PlusPlate yields 
optimal results. 

2.2.3. BioMatrix robot (for ultrahigh-throughput SGA analysis) 

The BioMatrix Colony Arrayer Robot can be purchased from S&P 
Robotics, Inc. (Toronto, ON; www.sprobotics.com). Use the following 
procedure to clean and sterilize the replicator pins prior to use of the robot: 

1. Fill the sonicator bath with 390 ml sterile distilled water. 

2. Clean the replicator pins in the sonicator for 5 min. 

3. Remove the water and fill the sonicator bath with 390 ml 70% ethanol. 

4. Sterilize the replicator in the sonicator for 20 s per cycle, repeat the 
cycle twice. 

5. Let the replicator sit in a tray of 100 ml of 95% ethanol for 5 s. 

6. Allow the replicator to dry over the fan for 20 s. 

Use the following procedure to sterilize the pins at the end of each 
replica pinning step: 

1. Set up the wash reservoirs as follows: Program water bath to automati- 
cally fill with sterile distilled water from bottle supply source, manually 
fill brush station with 320 ml sterile distilled water, fill sonicator with 
390 ml 70% ethanol and basin with 100 ml 95% ethanol. 

2. Let the replicator sit in the water bath for 10 s per cycle and repeat the 
cycle four additional times to remove residual cells from the replicator pins. 

3. Clean the replicator pins further at the brush station for three cycles. 

4. Sterilize the replicator in the 70% ethanol-sonicator bath for 20 s per 
cycle, repeat twice. 

5. Let the replicator sit in the 95% ethanol reservoir for 5 s. 

6. Allow the replicator to dry over the fan for 20 s. 

7. The BioMatrix robot can be used in conjunction with OmniTray 
(Nunc) petri dishes for the replica pinning steps involved in SGA 
analysis. 

2.3. Constructing a 1536-density DMA 

The collection of MATa deletion strains is available as stamped 96-well agar 
plates or as frozen stocks in 96-well plates from various sources (Invitrogen, 
American Type Culture Collection, EUROSCARF, and Open Biosystems). 
However, deletion strains are arrayed at a higher density for SGA analysis. 
Specifically, each SGA array plate consists of 384 mutant strains arrayed in 
quadruplicate resulting in an array density of 1536 yeast colonies/plate. The 
following section describes how to assemble high-density SGA arrays from 
low-density (96 colonies/plate) source arrays available from different sup- 
pliers. The procedure described here employs a BioMatrix Colony Arrayer 
Robot but high-density DMAs can also be assembled using manual pin 
tools or the Singer RoToR instrument. 

1. Peel off the foil coverings slowly on the frozen 96-well microtiter 
plates. 

2. Let the plates thaw completely on a flat surface. 

3. Mix the glycerol stocks gently by stirring with a 96-pin replicator. 

4. Replicate the glycerol stocks from the 96-well plates onto YEPD + 
G418 agar plates using the Library Copier with the pair of 
one-alignment holes on the front frame. Take extreme care that the pins 
do not drip liquid into neighboring wells. 

5. Reseal the 96-well plates with fresh aluminum sealing tape and return 
to -80 °C. 

6. Let cells grow at room temperature for ~2 days. 

7. Condense four plates of 96-format into one plate of 384-format using 
the BioMatrix Colony Arrayer Robot 96-pin replicator and the 
accompanying BioMatrix replicator software. 

8. Incubate 384-density arrays at room temperature for ~2 days. 

9. Replicate each 384-density DMA in quadruplicate onto a single plate 
using the BioMatrix robot 384-pin replicator and accompanying 
BioMatrix replicator software. This will generate a DMA consisting 
of 384 mutant strains and 1536 yeast colonies. 

10. Incubate at room temperature for ~2 days, to generate a 1536-density 
DMA working copy. 
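
The condensation steps above (four 96-format plates into one 384-format plate, then each 384-format colony pinned in quadruplicate to give 1536 colonies) can be sketched as coordinate arithmetic. This is a minimal illustration only: the interleaved layout assumed here (which source plate lands at which offset, and quadruplicates forming an adjacent 2 x 2 block) is a common convention and is not specified in the text, and the function names `to_384` and `to_1536` are hypothetical.

```python
def to_384(plate_idx, row, col):
    """Map a colony on 96-format source plate `plate_idx` (0-3) at (row, col)
    to its position on the condensed 384-format plate (16 rows x 24 columns).

    Assumed layout: the four source plates are interleaved, so plate 0 fills
    even rows/even columns, plate 1 even rows/odd columns, and so on.
    """
    if not (0 <= plate_idx < 4 and 0 <= row < 8 and 0 <= col < 12):
        raise ValueError("position outside a set of four 96-format plates")
    row_offset, col_offset = divmod(plate_idx, 2)
    return 2 * row + row_offset, 2 * col + col_offset


def to_1536(row, col):
    """Return the four positions on the 1536-format plate (32 rows x 48
    columns) holding the quadruplicate copies of a 384-format colony.

    Assumed layout: the four replicates form an adjacent 2 x 2 block.
    """
    if not (0 <= row < 16 and 0 <= col < 24):
        raise ValueError("position outside a 384-format plate")
    return [(2 * row + dr, 2 * col + dc) for dr in (0, 1) for dc in (0, 1)]


# Example: where does well (row 7, col 11) of the fourth source plate end up?
r384, c384 = to_384(3, 7, 11)
replicates = to_1536(r384, c384)
```

Under these assumptions every 384-format position is filled exactly once, and the 384 strains occupy 384 x 4 = 1536 distinct colony positions.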

2.4. SGA procedure 

The following section provides a detailed description of the six steps 
involved in SGA-mediated double mutant isolation. The media required 
for various selections are described in detail in Section 3. A schematic 
illustrating the SGA procedure is shown in Fig. 7.1. 

2.4.1. Query strain and DMA 

1. Grow the query strain in a 5 ml YEPD overnight culture. 

2. Pour the query strain culture onto a YEPD plate, then use the replicator to 
transfer the liquid culture onto two fresh YEPD plates at a density of 
1536 colonies. This generates a source of newly grown query cells for 
mating to the DMA. Pinning the query strain in 1536-format on an agar 
plate is advantageous because cells are transferred evenly in the subsequent 
mating steps. One query plate should contain enough cells for 
mating with eight plates of the DMA. Allow cells to grow at 30 °C for 
2 days. 

3. Replicate the DMA to fresh YEPD + G418 media. Allow cells to grow 
at 30 °C for 1 day. The DMA can be reused for three to four rounds of 
mating reactions. 



2.4.2. Mating the query strain with the DMA 

1. Pin the 1536-format query strain onto a fresh YEPD plate. 

2. Pin the DMA on top of the query cells. 

3. Incubate the mating plates at room temperature for 1 day. 



2.4.3. MATa/α diploid selection and sporulation 

1. Pin the resulting MATa/α zygotes onto YEPD + G418/clonNAT 
plates. 

2. Incubate the diploid selection plates at 30 °C for 2 days. 

3. Pin diploid cells onto enriched sporulation medium. 

4. Incubate the sporulation plates at 22 °C for 5 days. 



2.4.4. MATa meiotic progeny selection 

1. Pin spores onto SD − His/Arg/Lys + canavanine/thialysine plates. 

2. Incubate the haploid selection plates at 30 °C for 2 days. 

Figure 7.1 Synthetic genetic array (SGA) methodology. (A) A MATα strain carries a 
query mutation linked to a dominant selectable marker (filled black circle), such as the 
nourseothricin-resistance marker, natMX4, and the SGA reporter, can1Δ::STE2pr-Sp_his5 
(in which STE2pr-Sp_his5 is integrated into the genome such that it deletes 
the open reading frame (ORF) of the CAN1 gene, which normally confers sensitivity 

2.4.5. MATa-kanMX4 meiotic progeny selection 

1. Pin the MATa meiotic progeny onto (SD/MSG) − His/Arg/Lys + 
canavanine/thialysine/G418 plates. 

2. Incubate the kanMX4-selection plates at 30 °C for 2 days. 



2.4.6. MATa-kanMX4-natMX4 meiotic progeny selection 

1. Pin the MATa meiotic progeny onto (SD/MSG) − His/Arg/Lys + 
canavanine/thialysine/G418/clonNAT plates. 

2. Incubate the kanMX4/natMX4 selection plates at 30 °C for 1-2 days. 

3. Image the double mutant array plates and score for fitness defects. 
Barcode microarrays can be used as an alternative method to score 
double mutants for fitness defects. Since each of the deletion mutants 
is tagged with two unique oligonucleotide barcodes, their growth rates 
can also be monitored within a population of cells (Decourty et al., 2008; 
Pan et al., 2004). 
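
The pinning steps of Sections 2.4.2-2.4.6 can be summarized as a small table of media, temperatures, and incubation times, which makes it easy to estimate how long one screen takes. The sketch below only restates values given in the steps above; the `PinningStep` record and `total_days` helper are hypothetical names introduced for illustration, and preparation of the query strain and DMA (Section 2.4.1) is not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PinningStep:
    name: str
    medium: str
    temperature: str
    min_days: int
    max_days: int


# Values transcribed from Sections 2.4.2-2.4.6.
SGA_STEPS = [
    PinningStep("mating", "YEPD", "room temperature", 1, 1),
    PinningStep("diploid selection", "YEPD + G418/clonNAT", "30 C", 2, 2),
    PinningStep("sporulation", "enriched sporulation medium", "22 C", 5, 5),
    PinningStep("MATa haploid selection",
                "SD - His/Arg/Lys + canavanine/thialysine", "30 C", 2, 2),
    PinningStep("kanMX4 selection",
                "(SD/MSG) - His/Arg/Lys + canavanine/thialysine/G418",
                "30 C", 2, 2),
    PinningStep("kanMX4/natMX4 selection",
                "(SD/MSG) - His/Arg/Lys + canavanine/thialysine/G418/clonNAT",
                "30 C", 1, 2),
]


def total_days(steps):
    """Return the (minimum, maximum) total incubation time in days."""
    return sum(s.min_days for s in steps), sum(s.max_days for s in steps)


shortest, longest = total_days(SGA_STEPS)
print(f"One SGA screen needs {shortest}-{longest} days of incubation")
```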



2.5. Double mutant array image acquisition and processing 

Initial SGA screens focused on detecting severe synthetic sick or synthetic 
lethal interactions via visual inspection of double mutant colonies and 
comparison to wild-type controls (Tong et al., 2001, 2004). However, 
quantitative measurement of genetic interactions offers the potential for 
constructing high-resolution genetic networks (Collins et al., 2006; St Onge 
et al., 2007). We developed an automated computational pipeline for 
acquiring and processing yeast colony data to extract precise fitness and 
genetic interaction measurements from yeast double mutant colony size 
(Fig. 7.2) (Baryshnikova et al., manuscript in preparation; Tong et al., 2004). 



to canavanine). The query strain also lacks the LYP1 gene. Deletion of LYP1 confers 
resistance to thialysine. This query strain is crossed to an ordered array of MATa 
deletion mutants (xxxΔ). In each of these deletion strains, a single gene is disrupted 
by the insertion of a dominant selectable marker, such as the kanamycin-resistance 
(kanMX4) module (the disrupted gene is represented as a filled red circle). (B) The 
resulting heterozygous diploids are transferred to a medium with reduced carbon and 
nitrogen to induce sporulation and form haploid meiotic spore progeny. (C) Spores are 
transferred to a synthetic medium that lacks histidine, which allows selective 
germination of MATa meiotic progeny owing to the expression of the SGA reporter, 
can1Δ::STE2pr-Sp_his5. To improve this selection, canavanine and thialysine, which select 
can1Δ and lyp1Δ while killing CAN1 and LYP1 cells, respectively, are included in the 
selection medium. (D) The MATa meiotic progeny are transferred to a medium that 
contains kanamycin, which selects single mutants equivalent to the original array 
mutants and double mutants. (E, F) An array of double mutants is selected on a medium 
that contains both nourseothricin and kanamycin. 

Figure 7.2 Computational pipeline for processing SGA data. (A) Double mutant array plates are photographed by a high-resolution digital 
camera. (B, C) Digital images of the double mutant array plates are processed by a custom-developed image processing software that identifies 
the colonies and measures their areas in terms of pixels. (D) Quantified double mutant colony sizes are stored in the database for further 
manipulation and analysis. (E) To identify quantitative genetic interactions, the yeast colony data is retrieved from the database and a series of 
normalizations are applied to correct for numerous systematic experimental effects. (F) Genetic interactions are measured by combining the 
corrected double mutant fitness and the fitnesses of the two single mutants. (G) Genetic interaction data is made available via the DRYGIN 
web database system. 

Double mutant array plates are photographed in a light-controlled 
environment using a high-resolution digital imaging system developed by 
S&P Robotics, Inc. (Toronto, ON). Digital images are processed using 
custom-developed image-processing software that measures colony area in 
terms of pixels (Tong et al., 2004). Colony sizes are then stored in a 
PostgreSQL database for further manipulation and analysis. Thorough 
quality control procedures are applied to the data to ensure correct plate 
identity, screen quality, and proper image processing. 
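
To make "colony area in terms of pixels" concrete, the following minimal sketch counts bright pixels within each cell of a regular grid laid over a grayscale plate image. It is not the custom software referred to above, whose algorithm is not described here; the fixed grid, the single intensity threshold, and the function name `colony_areas` are all simplifying assumptions for illustration.

```python
def colony_areas(image, n_rows, n_cols, threshold):
    """Count above-threshold pixels in each cell of an n_rows x n_cols grid.

    `image` is a list of equal-length rows of grayscale intensities. Pixels
    brighter than `threshold` are treated as colony; each grid cell is assumed
    to contain at most one colony. Pixels left over when the image size is not
    an exact multiple of the grid are ignored.
    """
    height, width = len(image), len(image[0])
    cell_h, cell_w = height // n_rows, width // n_cols
    areas = [[0] * n_cols for _ in range(n_rows)]
    for y in range(cell_h * n_rows):
        for x in range(cell_w * n_cols):
            if image[y][x] > threshold:
                areas[y // cell_h][x // cell_w] += 1
    return areas


# Toy 4 x 4 "image" holding a 2 x 2 array of colony positions.
toy_image = [
    [0, 9, 0, 0],
    [9, 9, 0, 0],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
areas = colony_areas(toy_image, n_rows=2, n_cols=2, threshold=5)
```

For a real 1536-format plate the same call would use `n_rows=32, n_cols=48`, although a production pipeline would also need to locate the plate and handle colonies that touch or overgrow their grid cell.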

2.6. Quantitative scoring of genetic interactions using colony 
size-based fitness measurements 

Quantitative genetic interactions define double mutant combinations that 
deviate from an expected phenotype (Bateson et al., 1905). Mutations in 
independent genes often combine in a multiplicative manner and the 
resulting double mutant phenotype should be equivalent to the product of 
the phenotypes of the two individual mutations (Mani et al., 2008). The extent 
of the genetic interaction is consequently measured as 
$\varepsilon = f_{ab} - f_a \cdot f_b$, where $f_a$, $f_b$, and $f_{ab}$ 
are quantitative fitness measures of the two single and the double mutant, 
respectively (Elena and Lenski, 1997). Negative genetic interactions ($\varepsilon < 0$) 
refer to double mutants showing a more severe fitness defect than expected, 
with the extreme case being synthetic lethality. Positive genetic interactions 
($\varepsilon > 0$) refer to double mutants with a less severe fitness defect than 
expected and include interactions such as epistasis and suppression (Collins 
et al., 2006; Mani et al., 2008; Segre et al., 2005). 
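
As a worked illustration of this definition, the sketch below computes the interaction score from three fitness values and labels its sign. The fitness values in the example and the optional tolerance parameter `tol` (a dead band around zero) are hypothetical and are not taken from the source.

```python
def interaction_score(f_a, f_b, f_ab):
    """Deviation of the observed double mutant fitness from the
    multiplicative expectation f_a * f_b."""
    return f_ab - f_a * f_b


def classify(epsilon, tol=0.0):
    """Label an interaction by the sign of epsilon.

    `tol` is an illustrative tolerance: scores within +/- tol of zero are
    reported as showing no interaction.
    """
    if epsilon < -tol:
        return "negative"
    if epsilon > tol:
        return "positive"
    return "none"


# Two single mutants with fitness 0.8 and 0.5 are expected to give a double
# mutant with fitness 0.4 if the genes act independently.
lethal = interaction_score(0.8, 0.5, 0.0)      # inviable double mutant
epistatic = interaction_score(0.8, 0.5, 0.5)   # no worse than the sicker single
print(classify(lethal), classify(epistatic))
```

The first case corresponds to synthetic lethality (a double mutant far sicker than expected) and the second to a positive interaction in which the double mutant is fitter than the multiplicative expectation.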

To measure positive and negative genetic interactions quantitatively, 
precise estimates of both single and double mutant fitness are required. One 
method for measuring genetic interactions from yeast colony data derives 
single mutant fitness estimates from the average of all double mutants 
carrying the same query or array mutation (Collins et al., 2006). Double 
mutant colony sizes are subsequently normalized by single mutant fitness 
and experimental variance estimates, resulting in a single quantitative 
measure, termed the S-score, which reflects both the strength and confidence 
of the genetic interaction (Collins et al., 2006). 

Although quantitative, the S-score does not represent a true fitness- 
based measure of genetic interaction. Furthermore, this approach does not 
account for several experimental effects associated with SGA technology 
that introduce strong systematic biases in large-scale genetic interaction 
datasets and are particularly pronounced for high-density arrays. These 
experimental artifacts adversely affect genetic interaction measurements 
resulting in increased false-positive rates and reduced sensitivity. Thus, we 
developed a novel method to process raw yeast colony sizes and derive true 
measures of fitness and genetic interaction (Baryshnikova et al., manuscript 
in preparation). 

To do so, we first identified several systematic experimental factors that 
contribute the vast majority of the observed variance in colony size and 
seriously interfere with our ability to detect and measure genetic 
interactions. Appropriate normalization of these experimental effects is critical to 
ensure accurate and reliable genetic interaction measurements. The 
systematic effects and normalization methods are summarized below and described 
in detail elsewhere (Baryshnikova et al., manuscript in preparation). 

Systematic effects and normalization procedures: 

1. Plate-specific effect correction: normalizes differences in growth between 
plates due to varying incubation times and query single mutant fitness. 

2. Row/column effect correction: normalizes differences in colony growth 
caused by differential exposure to nutrients due to plate position. 

3. Spatial effect correction: normalizes differences in colony growth due to 
gradients in media thickness and/or relative proximity to heat sources. 

4. Competition effect correction: normalizes differences in colony growth due 
to reduced fitness of neighboring mutant strains that results in reduced 
competition for nutrients. 

5. Batch effect normalization: corrects for the striking similarity between 
genome-wide SGA screens conducted in parallel. We find that the set 
of screens completed by the same person, on the same robot and at the 
same time tend to share a common nonbiological signature. In other 
words, these screens share a set of unusually small or large colonies that 
could be confused with true genetic interactions, unless considered in 
the context of other screens performed in the same period of time. 
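
The first two corrections in the list above can be illustrated with a simple median-based scheme. This is a minimal sketch, not the published normalization, whose details are described elsewhere; dividing by plate, row, and column medians is an assumption chosen for clarity, and it presumes that all medians are positive.

```python
from statistics import median


def normalize_plate(sizes):
    """Plate-specific correction: divide every colony size by the plate
    median so that plates with different overall growth become comparable."""
    plate_median = median(v for row in sizes for v in row)
    return [[v / plate_median for v in row] for row in sizes]


def correct_rows_and_columns(sizes):
    """Row/column correction: divide each colony by its row median, then by
    the column median of the row-corrected values."""
    by_row = [[v / median(row) for v in row] for row in sizes]
    col_medians = [median(col) for col in zip(*by_row)]
    return [[v / cm for v, cm in zip(row, col_medians)] for row in by_row]


# Toy plate in which the top (edge) row grows twice as large as the rest.
raw = [
    [2.0, 2.0, 2.0],
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0],
]
corrected = correct_rows_and_columns(normalize_plate(raw))
```

After correction the inflated edge row is indistinguishable from the interior rows, so it would no longer be mistaken for a set of unusually fit double mutants. Spatial, competition, and batch effects would need additional steps that are not sketched here.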

To estimate double mutant fitness from colony size measurements, we 
developed an approach that models colony size as a multiplicative 
combination of double mutant fitness, systematic experimental factors, and 
random noise. Specifically, for a double mutant deleted for genes a and b, we 
modeled colony size as $C_{ab} = f_{ab} \cdot s_{ab} \cdot e$, where $C_{ab}$ is the colony area, $f_{ab}$ is 
the double mutant fitness, $s_{ab}$ is the combination of all systematic factors, and 
$e$ is log-normally distributed error. In addition to double mutant fitness, we 
also obtained accurate single mutant fitness estimates ($f_a$ and $f_b$) by averaging 
colony sizes for a given mutant across numerous control experiments using 
different arrays consisting of kanMX4- and natMX4-marked deletion 
mutants whose array positions have been randomized. Genetic interactions 
are subsequently measured by assuming the multiplicative model for 
independent genes, that is, $f_{ab} = f_a \cdot f_b + \varepsilon_{ab}$, where $\varepsilon_{ab}$ represents the genetic 
interaction term between genes a and b (Baryshnikova et al., manuscript in 
preparation). 
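
A minimal sketch of how this model could be inverted for one double mutant is shown below. Because the error term is log-normal, the replicate colony areas are averaged in log space (a geometric mean) after dividing out the systematic factor. The assumptions are labeled in the code: the systematic factor is taken as already known, the error is assumed to have zero mean on the log scale, and the function name and numbers are hypothetical rather than the authors' implementation.

```python
import math


def estimate_double_mutant_fitness(colony_areas, systematic_factor):
    """Estimate f_ab from replicate colony areas under C_ab = f_ab * s_ab * e.

    Assumes s_ab (`systematic_factor`) is already known and that log(e) has
    mean zero, so averaging log(C_ab / s_ab) and exponentiating recovers f_ab.
    """
    logs = [math.log(area / systematic_factor) for area in colony_areas]
    return math.exp(sum(logs) / len(logs))


# Four replicate colonies (hypothetical, already scaled so that a wild-type
# colony on an unbiased plate has area 1.0) and a systematic factor of 1.25.
f_ab = estimate_double_mutant_fitness([0.30, 0.25, 0.28, 0.27], 1.25)

# Single mutant fitness values (hypothetical) from control experiments.
f_a, f_b = 0.9, 0.6
epsilon_ab = f_ab - f_a * f_b   # rearranged from f_ab = f_a * f_b + epsilon_ab
```

Here the double mutant is far less fit than the multiplicative expectation, so `epsilon_ab` comes out negative.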

Along with correcting for systematic effects and estimating the genetic 
interaction factor, we derived an accurate estimate of variance for the 
interaction. Our variance estimate is reported as a p-value that reflects 
both the local variability of replicate colonies (based on 4 colonies/plate 
for each double mutant) and the variability of double mutants sharing the 
same query or array mutation. 

Accounting for systematic effects and estimating genetic interactions and 
confidence levels separately provide a boost in data quality and capacity for 
predicting gene function from the resulting genetic interactions (Fig. 7.3). For 
example, functionally related genes are known to be enriched for genetic 
interactions and to share similar genetic interaction patterns. Using Gene 
Ontology coannotation as a measure of functional relatedness, genetic 
interactions measured using our newly developed score (SGA score) can predict 100 
functionally related gene pairs with 62% precision (Fig. 7.3B, i). Failing to 
account for systematic experimental effects results in a significant reduction 
in precision (Fig. 7.3B, i). Analogously, similarity of genetic interaction 
profiles, as measured by Pearson correlation coefficients computed using SGA 
scores, predicts 100 functionally related gene pairs with 90% precision, 
compared to only 15% precision when normalization procedures are not applied 
(Fig. 7.3B, ii). 
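
Statements such as "100 functionally related gene pairs with 62% precision" refer to a point on a precision-recall curve: gene pairs are ranked by score, and precision is read off at the rank where the stated number of true positives has been recovered. The sketch below shows that calculation under the assumption that a higher score means a stronger prediction; the function name and toy data are hypothetical.

```python
def precision_at_true_positives(scored_pairs, n_true):
    """Precision at the rank where `n_true` related pairs have been found.

    `scored_pairs` is an iterable of (score, is_related) tuples, where
    is_related marks pairs coannotated to the same functional term.
    Returns None if fewer than `n_true` related pairs exist.
    """
    ranked = sorted(scored_pairs, key=lambda pair: pair[0], reverse=True)
    hits = 0
    for rank, (_, is_related) in enumerate(ranked, start=1):
        if is_related:
            hits += 1
            if hits == n_true:
                return hits / rank
    return None


# Toy ranking: two related pairs are recovered within the top three
# predictions, so precision at two true positives is 2/3.
toy = [(0.9, True), (0.8, False), (0.7, True), (0.1, True)]
precision = precision_at_true_positives(toy, 2)
```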

Access to genetic interaction scores and confidence values is provided via a 
web database system (DRYGIN) which facilitates retrieval of interactions and 
their analysis (Koh et al., 2010). 

2.7. Interpretation and analysis of genetic interactions 

Existing genetic interaction datasets in yeast have provided significant insight 
into the general principles of genetic network connectivity. For example, 
genes with related biological functions are connected by genetic interactions 
more often than expected by chance (Tong et al., 2004) (Fig. 7.4). The position 
and connectivity of a gene in the genetic interaction network are 
predictive of its function. Moreover, because the genetic network is a small-world 
network (Tong et al., 2004), genes within the same neighborhood of the 
network tend to interact with one another and thus a sparsely mapped network 
is predictive of genetic interactions (Fig. 7.4B). Furthermore, synthetic lethal 
(negative) genetic interactions among nonessential genes generally do not 
correspond to physical interactions between the corresponding gene products. 
It has been observed that negative genetic interactions are more frequent 
between genes lying in different pathways, whereas physical interactions are 
more frequent among gene products functioning within the same pathway 
(Bader et al., 2004; Collins et al., 2007; Kelley and Ideker, 2005; Tong et al., 
2004; Ye et al., 2005) (Fig. 7.4C). However, when a pathway or complex 
contains at least one essential gene, it is often enriched for so-called 
within-pathway synthetic lethal interactions, indicating that a subset of the negative 
genetic interactions for essential genes overlaps with protein-protein 
interactions (Bandyopadhyay et al., 2008; Boone et al., 2007). Positive genetic 
interactions can connect members of the same protein complex or pathway. These 
positive within-pathway interactions may reflect that both the single and 

Figure 7.3 Systematic effect correction. (A) A schematic of a typical double mutant 
array plate shows the systematic biases affecting colony sizes. (i, ii) A typical double 
mutant array plate contains control spots (gray circles), strains with low fitness (blue 
circles), negative interactions (red circles), and positive interactions (not shown). On 
visual inspection, all three cases appear as small colonies or empty spots. (iii) 
Quantification of colony areas shows a distinctive spatial pattern affecting opposite sides of the 
plate (bigger colonies on the right, smaller colonies on the left) that was not obvious on 
visual inspection. Failure to correct for this spatial pattern will result in false-positive 
interactions. (iv) Correcting for the spatial pattern eliminates false positives and highlights true 
genetic interactions. (B) Precision-recall curves on genetic interaction scores (i) and 
genetic profile similarity (ii) show the increased functional prediction capacity of 
genetic data after correcting for systematic biases. A set of 1712 genome-wide SGA 
screens (Costanzo et al., in press) were processed using the SGA score (Baryshnikova 
et al., manuscript in preparation) and a version of the SGA score without systematic 
effect correction. Both direct genetic interactions and genetic profile similarities, as 
measured by Pearson correlations, were assessed for function by calculating precision 
and recall of functionally related gene pairs as described in the study of Myers et al. 
(2006). As a measure of functional relatedness, we used coannotation to the same Gene 
Ontology term. 

double mutants of nonessential linear pathways have the same fitness defect and 
therefore do not show the expected fitness defect associated with the 
multiplicative model. However, a more recent analysis of the global yeast physical 
interaction network, as defined by affinity purification-mass spectrometry, the yeast 
two-hybrid protocol, or the protein-fragment complementation assay (PCA), 
showed that roughly equivalent numbers of physical interactions overlap 
with negative and positive genetic interaction pairs: ~7% of protein-protein 
interacting pairs shared a negative genetic interaction, whereas ~5% shared a 
positive interaction (Costanzo et al., in press). Conversely, only a small fraction 
of gene pairs that show a genetic interaction (0.4% negative and 0.5% positive) 
are also physically linked (Costanzo et al., in press). These findings therefore 
suggest that the vast majority of both positive and negative interactions occur 
between, rather than within, complexes and pathways, connecting those that 
presumably work together or buffer one another, respectively. 

The synthetic lethal or negative genetic interaction profile for a particular 
query gene provides a rich phenotypic signature reflecting the function of the 
query, as it contains genes involved in pathways that buffer the query. On 
average, negative interactions are approximately twofold more prevalent than 
positive interactions (Costanzo et al., in press). However, we found subsets of 
genes that showed a strong bias in interaction type (up to 8- to 16-fold more 
negative than positive interactions or vice versa) (Costanzo et al., in press). 
While the genetic interaction profile composed of negative genetic 
interactions carries more functional information than that composed only of positive 
interactions, the profile composed of both positive and negative interactions is 
most informative about gene function (Fig. 7.4). Clustering of genes 
according to their genetic interaction profiles is a simple yet very powerful tool for 
gene function prediction via a "guilt-by-association" approach (Figs. 7.4 and 
7.5). Open source clustering software, such as Cluster 3.0 
(http://bonsai.ims.u-tokyo.ac.jp/~mdehoon/software/cluster/software.htm), reassorts rows 



Figure 7.4 Properties of genetic interactions. (A) Example of a yeast synthetic lethal 
network. The synthetic lethal network is a sparse network, indicating that genetic 
interactions are rare. The frequency of true synthetic lethal interactions (blue lines) is 
less than 1%. A detailed description of how this initial network was generated can be 
found elsewhere (Tong et al., 2001). (B) Functional neighborhood corresponding to 
indicated region (dashed gray circle) in (A). Despite being rare, synthetic lethal inter- 
actions (blue lines) occur frequently among genes that are functionally related, such as 
those involved in DNA replication and repair shown here. The frequency of synthetic 
lethal interaction between functionally related genes ranges from 18% to 25%. 

(C) Orthogonal relationships. Negative interactions tend to occur between nonessen- 
tial complexes and pathways. Positive interactions overlap significantly with physical 
interactions and tend to connect members of the same pathway or complex. 

(D) Grouping genes according to patterns of genetic interactions revealed a functional 
relationship between the elongator complex and the urmylation pathway, which act in 
concert to modify specific tRNAs. 



[Figure 7.5 image, panels A-C; the caption appears below. Recoverable cluster labels: mitochondria; ribosome and translation; metabolism and amino acid biosynthesis; chromatin and transcription; nuclear-cytoplasmic transport; secretion and vesicle transport; protein folding and glycosylation, cell wall biosynthesis; nuclear migration and protein degradation; mitosis and chromosome segregation; DNA replication and repair; cell polarity and morphogenesis; autophagy; endosome and vacuole sorting; amino acid biosynthesis and uptake; tRNA modification; cell wall biosynthesis and integrity; ER/Golgi; glutamate biosynthesis; homoserine, chorismate and serine biosynthesis; Gap1 sorting pathway.]




(query genes) and columns (array genes) of a genetic interaction matrix to 
place genes sharing similar genetic profiles next to each other. On the 
resulting clustergram, which is easily visualized with an open source 
application like Java Treeview (http://jtreeview.sourceforge.net), functionally 
related genes, including members of the same protein complex or pathway, 
are normally located in close proximity to each other, and coclustering of 
genes of known function with poorly characterized genes provides strong 
evidence for cofunction (Tong et al., 2004). 

An alternative approach for clustering genetic interaction profiles 
consists of computing pair-wise correlation coefficients among all genes in a 
dataset (Costanzo et al., in press; Kim et al., 2001). The resulting data can be 
visualized as a network (using network visualization software such as 
Cytoscape (Shannon et al., 2003)) where two genes are connected if their 
correlation exceeds a chosen threshold. The network can then be 
reorganized by a force-directed network layout in which highly correlated genes 
attract each other, while less correlated genes are repelled. Applying such an 
approach to the correlation-based genetic interaction network generates 
readily discernible clusters corresponding to distinct biological processes 
(Fig. 7.5A). This highly structured organization of the genetic map is 
maintained at increasing levels of resolution. For example, in one region 
of the global network, the related processes of endoplasmic reticulum 



Figure 7.5 The genetic landscape of the cell. (A) A correlation-based network 
connecting genes with similar interaction profiles (Costanzo et al., in press). Genetic profile 
similarities were measured for all gene pairs by computing Pearson correlation 
coefficients (PCC) from the complete genetic interaction matrix. Gene pairs whose profile 
similarity exceeded a PCC > 0.2 threshold were connected in the network. An 
edge-weighted, spring-embedded network layout, implemented in Cytoscape (Shannon 
et al., 2003), was applied to determine node position based on genetic profile similarity. 
This resulted in the unbiased assembly of a network whereby genes sharing similar 
patterns of genetic interactions are proximal to each other in two-dimensional space, 
while less-similar genes are positioned further apart. Circled regions correspond to gene 
clusters enriched for the indicated biological processes. (B) Magnification of the func- 
tional map resolves cellular processes with increased specificity and enables precise 
functional predictions. A subnetwork corresponding to the indicated region of the 
global map is shown. Node color corresponds to a specific biological process; amino 
acid biosynthesis and uptake (dark green); signaling (light green); ER/Golgi (light 
purple); endosome and vacuole sorting (dark purple); ER-dependent protein degrada- 
tion (yellow); protein glycosylation, cell wall biosynthesis and integrity (red); tRNA 
modification (fuchsia); cell polarity and morphogenesis (pink); autophagy (orange); 
uncharacterized (black). (C) Individual genetic interactions contributing to genetic 
profiles revealed by (B). A subset of genes belonging to the amino acid biosynthesis 
and uptake region of the network in (B). Nodes are grouped according to profile 
similarity and edges represent negative (red) and positive (green) genetic interactions. 
Nonessential (circles) and essential (diamonds) genes are colored according to the 
biological process indicated in (B) and uncharacterized genes are depicted in yellow. 

(ER)/Golgi traffic, endosome/vacuole protein sorting, cell polarity, 
morphogenesis, cell wall integrity, protein folding, glycosylation, and 
ER-dependent protein degradation all cluster into well-delineated groups 
(Fig. 7.5B). Furthermore, closer interrogation of the genetic map makes it 
possible to identify uncharacterized genes located next to known functional clusters, 
suggesting previously unanticipated roles for the uncharacterized genes in 
the process (Fig. 7.5C). 
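
The correlation-based network described above can be sketched in a few lines: compute a Pearson correlation coefficient for every pair of genetic interaction profiles and keep an edge when it exceeds a threshold (0.2 in the Fig. 7.5 caption). The gene names and profile values below are hypothetical, and the resulting edge list would still need to be laid out by a tool such as Cytoscape; this sketch does not perform the force-directed layout.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles.
    Returns 0.0 if either profile has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def correlation_network(profiles, threshold=0.2):
    """Return (gene1, gene2, correlation) edges for all gene pairs whose
    profile similarity exceeds `threshold`."""
    edges = []
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if r > threshold:
            edges.append((g1, g2, r))
    return edges


# Hypothetical interaction profiles (one score per array gene).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],   # same pattern as geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],   # opposite pattern
}
edges = correlation_network(profiles)
```

Only geneA and geneB are connected here; geneC is anticorrelated with both and would sit apart from them after layout.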

2.8. S. pombe SGA 

S. pombe, or fission yeast, is separated from S. cerevisiae by hundreds of millions 
of years of evolution and therefore provides an excellent system for 
exploring the conservation of genetic interactions. Moreover, S. pombe never 
underwent an ancient genome duplication and thus, unlike S. cerevisiae, 
where functional complementation by paralogous genes can sometimes 
obscure genetic interactions, the corresponding S. pombe ortholog often 
shows a rich genetic interaction profile (Dixon and Boone, unpublished 
data). In addition, several gene classes and mechanisms are absent in 
S. cerevisiae but present in S. pombe and other eukaryotes (Aravind et al., 
2000). For example, S. pombe has an RNA interference (RNAi) pathway 
similar to that seen in other eukaryotes, while baker's yeast does not 
(White and Allshire, 2008). Thus, analysis of genetic interactions in S. pombe 
should provide not only complementary information to that obtained in 
S. cerevisiae, but also insight into processes that are inaccessible in S. cerevisiae. 

The sequencing of the S. pombe genome (Wood et al., 2002) and the
subsequent availability of a genome-wide deletion collection from the
commercial company Bioneer (http://pombe.bioneer.co.kr/) have driven
the development of SGA-like techniques for this organism (Dixon et al.,
2008; Roguev et al., 2007). For S. pombe SGA (SpSGA) (Dixon et al., 2008),
we developed a protocol that enables efficient isolation of double-mutant
haploids through use of the high-density arraying capabilities of the Singer
RoToR system.

A schematic illustrating the SpSGA procedure is shown in Fig. 7.6. 

2.8.1. SpSGA procedure 
Preparing the RoToR for use 

1. Turn on the RoToR software and hardware and sterilize the interior 
using the integral UV lamp. From the main menu, select: LAMP and 
then set the timer for a minimum of 30 min. 



Preparation of a 384-formatted query plate 

1. Grow query strain in 8 ml YES liquid media, with shaking, overnight at
30 °C (or appropriate temperature for sensitive strains).

Figure 7.6 Outline of the SpSGA method. Cells of opposite mating type (h+, h-) are
mated on minimal SPA media and allowed to sporulate for 3 days at 26 °C. Then, to
enrich for spores, mating plates are transferred to 42 °C for 3 days, a treatment that
kills unmated haploid cells. Following spore enrichment, cells are transferred to rich
medium to allow for germination, then transferred again to double-drug medium to
select for recombinant double-mutant progeny. S. pombe haploids do not mate on rich
medium; therefore, selection for a specific haploid mating type is not required.

2. Pour liquid culture into an empty PlusPlate dish, ensuring the entire 
surface area is covered. 

3. Place the source dish containing the liquid culture in the black position.
Place the YES (agar) target plate in the red position. With the RoToR,
select: LIQUID HANDLING -> BATH -> BATH-96. As prompted,
load the hopper with 96 long pin pads. Under OPTIONS check
ECONOMY and REVISIT SOURCE. Start the program. This will
transfer a droplet of the culture to each position on the array in four steps,
generating a 384 array (4 × 96). Caution. When in use, the RoToR
robot arm and turntable move very quickly. Ensure that the safety cover
is in place during operation and that hands and loose clothing are kept
well clear of the moving parts.
4. Grow the query plate in an incubator at 30 °C for 2 days or until 
sufficient growth is obtained. With larger colonies, a single query plate 
can be used for multiple matings. 

Preparation of a 384-formatted array plate 

1. Plates obtained from Bioneer are arrayed in a 96-position format. These
can be rearrayed 1:4, giving four copies of each strain, or rearrayed at
higher density (the RoToR 4 -> 1 arraying protocol could possibly be used,
although we have not tried that ourselves).

2. Grow the array plate in an incubator at 30 °C for 2 days or until sufficient 
growth is obtained. 
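
The 96-to-384 expansion used for both the query and array plates can be pictured as a coordinate mapping. The following Python sketch is illustrative only: it assumes an interleaved layout in which each 96-format position becomes a 2 × 2 block of the 384-format plate, and the function names are hypothetical. The actual pin offsets applied by the RoToR programs are not specified here and may differ.

```python
from string import ascii_uppercase


def expand_96_to_384(row, col):
    """Return the four 384-format positions derived from one 96-format position.

    Rows and columns are zero-based. ASSUMPTION: an interleaved layout in
    which each source position maps to a 2 x 2 block of the target plate.
    """
    if not (0 <= row < 8 and 0 <= col < 12):
        raise ValueError("96-format plates have 8 rows and 12 columns")
    return [(2 * row + dr, 2 * col + dc) for dr in (0, 1) for dc in (0, 1)]


def well_name(row, col):
    """Convert zero-based coordinates to a conventional well label such as 'A1'."""
    return f"{ascii_uppercase[row]}{col + 1}"


# Example: where do the four copies of the strain in 96-format well A1 land?
copies = expand_96_to_384(0, 0)
print([well_name(r, c) for r, c in copies])
```

A mapping of this kind is convenient when colony measurements from the 384-format plate have to be traced back to the strain identity recorded for the original 96-format plate.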

Mating of query to array 

1. Place the YES query source plate in the blue position, the YES array
source plate in the red position, and the target SPA mating plate in
the black position. With the RoToR, select: MATE -> 384. Load the
hopper with the appropriate number of 384 short pin pads. Start the
program.

2. Remove the SPA mating dish and set aside for further processing.
The query and/or array source dishes can continue to be used as
needed. Critical step. It is essential that the query and array source plates
are freshly grown (typically 2 days growth is optimal). Plates used after
storage at 4 °C do not mate as efficiently. Depending on the amount of
cell growth, a single 384-format query plate can be mated to 3–4
separate array plates. However, further attempts may result in a reduced
number of transferred cells, interfering with mating efficiency.

3. Following the transfer of cells onto the SPA plate, use the RoToR to
transfer a drop of sterile H2O onto the mated cells and mix them
together. Load a PlusPlate dish containing sterile H2O into the black
position and a freshly mated SPA plate in the red position. On the
RoToR select: LIQUID HANDLING -> BATH -> BATH-96.
Load 96 long pin pads. Under advanced options, select the Agar Mix
tab and set a mix diameter of 0.2 mm. Start the program. Critical step.
A droplet of water together with the punching of the agar is thought to
facilitate access of each yeast to the other. This greatly enhances the
mating efficiency, compared to no mixing conditions, and thereby
enhances sporulation substantially. This in turn reduces the number of
unmated haploids that need to be eliminated in subsequent steps.

4. Put SPA plates at 26 °C and allow cells to proceed through conjugation 
and sporulation for 3 days. 

5. Put SPA dishes at 42 °C for 3 days. Critical step. Ensure that the 
incubator is maintained at 42 °C. Higher or lower temperatures are 
not as efficient at eliminating unmated haploids. Note: significant tem- 
perature gradients may be observed within certain incubators, so it is 
advisable to measure the temperature at several places within the 
enclosure. 

6. Transfer spores to YES plates as follows. Load the SPA source plate into
the red position and the YES target plate into the blue position. Using
the RoToR, select: AGAR-AGAR -> REPLICATE -> REPLICATE
ONE -> 384-1536. Load the hopper with 384 short pin pads
and start the program using default advanced options with "revisit
source" enabled.

7. Pause point. Once spores have been transferred to YES plates, it is 
advisable to store the SPA mating plates at 4 °C. Spores remaining on 
this plate can be subsequently stored up to several months and used later 
as a source of spores for further confirmatory experiments, such as 
random spore analysis. 

8. Put YES plates at 30 °C and allow cells to germinate for at least 2 days. 

9. Transfer germinated cells from source YES plates loaded in the red
position to YES + G418 + Nat target plates loaded in the blue position.
With the RoToR, select: AGAR-AGAR -> REPLICATE ->
REPLICATE ONE -> 1536-1536. Load the hopper with 1536 short
pin pads and start the program using default advanced options.

10. Transfer YES + G418 + Nat plates to 30 °C to allow cells to grow for 
2 days. 

11. Pause point. Once yeast cells have grown up on YES + G418 + Nat 
plates they can be further processed right away, or stored at 4 °C for up 
to several months for subsequent processing. 

12. Image plates to acquire colony size measurements or carry on with 
additional analysis of recombinant haploids. 
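
For planning purposes, the incubation times given in the steps above can be added up into a simple calendar. The sketch below is a hedged illustration, not part of the published protocol: the step labels are paraphrased, the minimum durations stated in the text are used, the overnight liquid culture of the query strain is not counted, and the start date is arbitrary.

```python
from datetime import date, timedelta

# (step description, minimum incubation in days) taken from the procedure above
SPSGA_STEPS = [
    ("Grow 384-format query and array plates at 30 C", 2),
    ("Mate and sporulate on SPA at 26 C", 3),
    ("Enrich for spores at 42 C", 3),
    ("Germinate on YES at 30 C", 2),
    ("Select on YES + G418 + Nat at 30 C", 2),
]


def schedule(start):
    """Return (step, begin, end) tuples for a run started on the given date."""
    plan = []
    day = start
    for name, days in SPSGA_STEPS:
        end = day + timedelta(days=days)
        plan.append((name, day, end))
        day = end
    return plan


plan = schedule(date(2024, 1, 1))  # arbitrary example start date
for name, begin, end in plan:
    print(f"{begin} -> {end}  {name}")
total_days = (plan[-1][2] - plan[0][1]).days
print("Minimum duration (days):", total_days)
```

Because several steps are given as minimum times ("at least 2 days", "until sufficient growth is obtained"), the total should be read as a lower bound.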




3. Media and Stock Solutions 



3.1. SGA media and stock solutions 



1. G418 (Geneticin, Invitrogen): Dissolve in water at 200 mg/ml, filter 
sterilize, and store in aliquots at 4 °C. 

2. clonNAT (nourseothricin, Werner Bio Agents, Jena, Germany): Dissolve in 
water at 100 mg/ml, filter sterilize, and store in aliquots at 4 °C. 




3. Canavanine (L-canavanine sulfate salt, Sigma, C-9758): Dissolve in water 
at 100 mg/ml, filter sterilize, and store in aliquots at 4 °C. 

4. Thialysine (S-(2-aminoethyl)-L-cysteine hydrochloride, Sigma, A-2636): Dissolve
in water at 100 mg/ml, filter sterilize, and store in aliquots at 4 °C.

5. Amino acids supplement powder mixture for synthetic media (complete): 
Contains 3 g adenine (Sigma), 2 g uracil (ICN), 2 g inositol, 0.2 g 
para-aminobenzoic acid (Acros Organics), 2 g alanine, 2 g arginine, 
2 g asparagine, 2 g aspartic acid, 2 g cysteine, 2 g glutamic acid, 2 g 
glutamine, 2 g glycine, 2 g histidine, 2 g isoleucine, 10 g leucine, 
2 g lysine, 2 g methionine, 2 g phenylalanine, 2 g proline, 2 g serine, 
2 g threonine, 2 g tryptophan, 2 g tyrosine, and 2 g valine (Fisher). 
Drop-out (DO) powder mixture is a combination of the above ingre- 
dients minus the appropriate supplement. Two grams of the DO 
powder mixture is used per liter of medium. 

6. Amino acids supplement for sporulation medium: Contains 2 g histidine, 
10 g leucine, 2 g lysine, 2 g uracil; 0.1 g of the amino acid supplements 
powder mixture is used per liter of sporulation medium. 

7. Glucose (Dextrose, Fisher): Prepare 40% solution, autoclave, and store at 
room temperature. 

8. YEPD: Add 120 mg adenine (Sigma), 10 g yeast extract, 20 g peptone,
20 g bacto agar (BD Difco) to 950 ml water in a 2 l flask. After
autoclaving, add 50 ml of 40% glucose solution, mix thoroughly,
cool to ~65 °C and pour plates.

9. YEPD + G418: Cool YEPD medium to ~65 °C, add 1 ml of G418
stock solution (final concentration 200 mg/l), mix thoroughly, and
pour plates.

10. YEPD + clonNAT: Cool YEPD medium to ~65 °C, add 1 ml of
clonNAT stock solution (final concentration 100 mg/l), mix
thoroughly, and pour plates.

11. YEPD + G418/clonNAT: Cool YEPD medium to ~65 °C, add 1 ml
of G418 (final concentration 200 mg/l), and 1 ml of clonNAT (final
concentration 100 mg/l) stock solutions, mix thoroughly, and pour
plates.

12. Enriched sporulation: Add 10 g potassium acetate (Fisher), 1 g yeast
extract, 0.5 g glucose, 0.1 g amino acids supplement powder mixture
for sporulation, 20 g bacto agar to 1 l water in a 2 l flask. After
autoclaving, cool medium to ~65 °C, add 250 µl of G418 stock
solution (final concentration 50 mg/l), mix thoroughly, and pour plates.

13. (SD/MSG) - His/Arg/Lys + canavanine/thialysine/G418: Add 1.7 g
yeast nitrogen base w/o amino acids or ammonium sulfate (BD Difco),
1 g MSG (L-glutamic acid sodium salt hydrate, Sigma), 2 g amino acids
supplement powder mixture (DO - His/Arg/Lys), 100 ml water in a
250 ml flask. Add 20 g bacto agar to 850 ml water in a 2 l flask.
Autoclave separately. Combine autoclaved solutions, add 50 ml 40%
glucose, cool medium to ~65 °C, add 0.5 ml canavanine (50 mg/l),
0.5 ml thialysine (50 mg/l), and 1 ml G418 (200 mg/l) stock solutions,
mix thoroughly, and pour plates. Ammonium sulfate impedes the
function of G418 and clonNAT. Hence, synthetic medium containing
these antibiotics is made with monosodium glutamic acid as a nitrogen
source (Cheng et al., 2000).

14. (SD/MSG) - His/Arg/Lys + canavanine/thialysine/clonNAT: Add
1.7 g yeast nitrogen base w/o amino acids or ammonium sulfate, 1 g
MSG, 2 g amino acids supplement powder mixture (DO - His/Arg/Lys),
100 ml water in a 250 ml flask. Add 20 g bacto agar to 850 ml water
in a 2 l flask. Autoclave separately. Combine autoclaved solutions, add
50 ml 40% glucose, cool medium to ~65 °C, add 0.5 ml canavanine
(50 mg/l), 0.5 ml thialysine (50 mg/l), and 1 ml clonNAT (100 mg/l)
stock solutions, mix thoroughly, and pour plates.

15. (SD/MSG) - His/Arg/Lys + canavanine/thialysine/G418/clonNAT:
Add 1.7 g yeast nitrogen base w/o amino acids or ammonium sulfate,
1 g MSG, 2 g amino acids supplement powder mixture (DO - His/Arg/Lys),
100 ml water in a 250 ml flask. Add 20 g bacto agar to 850 ml
water in a 2 l flask. Autoclave separately. Combine autoclaved
solutions, add 50 ml 40% glucose, cool medium to ~65 °C, add
0.5 ml canavanine (50 mg/l), 0.5 ml thialysine (50 mg/l), 1 ml G418
(200 mg/l) and 1 ml clonNAT (100 mg/l) stock solutions, mix
thoroughly, and pour plates.

16. (SD/MSG) Complete: Add 1.7 g yeast nitrogen base w/o amino acids or
ammonium sulfate, 1 g MSG, 2 g amino acids supplement powder
mixture (complete), 100 ml water in a 250 ml flask. Add 20 g bacto
agar to 850 ml water in a 2 l flask. Autoclave separately. Combine
autoclaved solutions, add 50 ml of 40% glucose, mix thoroughly, cool
medium to ~65 °C and pour plates.

17. SD - His/Arg/Lys + canavanine/thialysine: Add 6.7 g yeast nitrogen
base w/o amino acids (BD Difco), 2 g amino acids supplement powder
mixture (DO - His/Arg/Lys), 100 ml water in a 250 ml flask. Add
20 g bacto agar to 850 ml water in a 2 l flask. Autoclave separately.
Combine autoclaved solutions, add 50 ml 40% glucose, cool medium
to ~65 °C, add 0.5 ml canavanine (50 mg/l) and 0.5 ml thialysine
(50 mg/l) stock solutions, mix thoroughly, and pour plates. This medium
does not contain any antibiotics such as G418 or clonNAT and
therefore ammonium sulfate is used as the nitrogen source.

18. SD - Leu/Arg/Lys + canavanine/thialysine: Add 6.7 g yeast nitrogen
base w/o amino acids, 2 g amino acids supplement powder mixture
(DO - Leu/Arg/Lys), 100 ml water in a 250 ml flask. Add 20 g bacto
agar to 850 ml water in a 2 l flask. Autoclave separately. Combine
autoclaved solutions, add 50 ml 40% glucose, cool medium to ~65 °C,
add 0.5 ml canavanine (50 mg/l) and 0.5 ml thialysine (50 mg/l) stock
solutions, mix thoroughly, and pour plates.

3.2. SpSGA media and stock solutions

1. 250 mg/ml of G418 solution. Dissolve 2.5 g G418 in 10 ml of water. Filter
sterilize with a 0.22 µm screw cap filter.

2. 100 mg/ml of clonNat solution. Dissolve 1 g clonNat in 10 ml of water.
Filter sterilize with a 0.22 µm screw cap filter.

3. SPA-agar plates (SPA plates). Add 30 g agar, 10 g dextrose, 1 g of
KH2PO4, and 1 ml of 1000× vitamin stock to 1 l of water. Autoclave
and then pour 50 ml per PlusPlate dish. Allow to solidify and either use
right away or store at 4 °C until needed.

4. 1000× vitamin stock. Dissolve 1 g pantothenic acid, 10 g nicotinic acid,
10 g inositol, and 10 mg biotin in 1 l of water.

5. YES media and plates. Add 30 g glucose, 5 g yeast extract, 225 mg
adenine, 225 mg L-histidine, 225 mg leucine, 225 mg uracil, and
225 mg lysine to 1 l of water. Autoclave and then cool to 55 °C before
adding appropriate antibiotics (1:1000 dilution). To make solid media,
add 30 g agar prior to autoclaving and pour 50 ml per PlusPlate.
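
The stock volumes quoted in Sections 3.1 and 3.2 follow from the usual dilution relation (stock concentration × stock volume = final concentration × final volume). As a minimal sketch for scaling recipes to other batch sizes, the hypothetical helper below reproduces the volumes given in the text; it ignores the small volume change caused by adding the stock itself.

```python
def stock_volume_ml(final_mg_per_l, stock_mg_per_ml, medium_l=1.0):
    """Volume of stock (ml) needed to reach a final concentration in the medium.

    Uses C_stock * V_stock = C_final * V_final and neglects the volume
    contributed by the stock solution.
    """
    if stock_mg_per_ml <= 0:
        raise ValueError("stock concentration must be positive")
    return final_mg_per_l * medium_l / stock_mg_per_ml


# Values from Section 3.1: stock concentration (mg/ml) and final concentration (mg/l)
print(stock_volume_ml(200, 200))  # G418 in YEPD + G418
print(stock_volume_ml(50, 100))   # canavanine in SD/MSG selection media
print(stock_volume_ml(50, 200))   # G418 in enriched sporulation medium

# Section 3.2: a 1:1000 dilution of the 250 mg/ml G418 stock corresponds to
# 1 ml of stock per litre of YES, i.e. a final concentration of 250 mg/l.
print(stock_volume_ml(250, 250))
```

For a half-litre batch, `stock_volume_ml(200, 200, medium_l=0.5)` gives the correspondingly halved volume.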




4. Applications of SGA Methodology 

Most large-scale studies have focused on fitness as the primary phenotype
to identify genetic interactions (St Onge et al., 2007; Tong et al., 2004).
In theory, all phenotypes are measurable and amenable to genetic interaction
analyses. SGA methodology provides an efficient and systematic means for
combining mutations and can be readily applied to identify additional genetic
interactions that do not result in overt fitness defects. For example,
reporter-gene constructs can be incorporated into the SGA methodology to monitor
specific transcriptional responses in the ~5000 deletion mutant backgrounds
(Costanzo et al., 2004; Fillingham et al., 2009) and used as an alternative to
fitness for uncovering genetic interactions (Jonikas et al., 2009). Integrating
SGA technology with methodologies for measuring a diverse set of phenotypes
generates networks that provide comprehensive genome coverage and
accurately reflect global cellular functions.

4.1. Integrating SGA and high-content screening 

Combining SGA technology with different cytological reporters and
high-content screening (HCS) methodologies also enables identification of mutant
combinations that lack obvious growth defects but elicit subtle yet unexpected
cell biological phenotypes (Vizeacoumar et al., 2009, 2010). An HCS platform
gathers cell biological information for genome-wide arrays of mutants by first
acquiring cell images and then quantifying specific morphological phenotypes
following image processing (Fig. 7.7). The following section describes the steps
involved in this procedure, which can be applied to monitor virtually any
subcellular event. The media required for various selections are
described in detail in Section 3.

1. Cross the SGA query strain expressing the cell biological reporter to the 
deletion array as described in Section 2.4. 

2. Once the MATa-kanMX4 meiotic progeny is selected on agar plates as 
described in Section 2.4, transfer the cultures to liquid. Select for 
MATa-kanMX4-natMX4 meiotic progeny by inoculating the cells in 
(SD/MSG) - His/Arg/Lys + canavanine/thialysine/G418/clonNAT 
liquid media in 96-well format plates. Be sure to maintain the auxotrophic
selection for the reporter throughout the process.

3. Grow liquid cultures overnight at the desired temperature in 96-well 
plates containing 3 mm glass beads (Fisher Scientific). Seal the plates with 
breathable adhesive membranes (Corning). 

4. Dispense an appropriate volume of each sample, based on its optical
density, into 96-well filter plates (Whatman) using a liquid handling
robot (Biomek FX), to ensure uniform and optimal cell densities for
subsequent image analysis.

5. Vacuum the filter plates (NucleoVac96 vacuum manifold, Macherey-
Nagel) and wash the cells in sterile distilled water using a plate washer
(Multidrop384 Liquid Dispenser System, Thermo Electron Corporation).
Vacuum the plates again so that only the washed cells remain in
the filter plates.

6. Add low-fluorescence medium (Sheff and Thorn, 2004) with appropriate
drug selection to the filter plates to resuspend the washed cells,
transfer the cells from the filter plates to 96-well optical plates (Matrical)
using the liquid handler (Biomek FX), and let the cells settle. Seal
the plates with aluminum foil (VWR) to prevent drying of the samples.

7. Load the plates into an automated incubator (Cytomat, Thermo Fisher
Scientific, Inc.) and store the plates at 4 °C prior to imaging to avoid
overgrowth. Automate the platform such that each plate is incubated at
30 °C for 30–45 min prior to imaging. Using a robotic arm (CRS
Catalyst Express, Thermo Electron Corporation), transfer the plates to
the wide-field HCS imager (ImageXpress5000A, MDS Analytical, Inc.).

8. Image plates using a 60× air objective, collecting at least 4–6 images per
well, averaging as many as 200 cells per mutant. Ensure proper storage
of images in a database so that they can be readily accessed for automated
image analysis.

9. Analyze the images using image-analysis software with throughput
capabilities (MetaXpress software v1.6, MDS Analytical, Inc.) and
extract morphometric features to generate a unique profile for each
mutant. Open source software developed by academic labs, such as
CellProfiler, may also be used for this purpose (Carpenter, 2007).
Briefly, after shade correction and background subtraction, apply a
segmentation technique such as thresholding, so that the range of signal
intensity that pertains to cellular objects is selected, separating bright
cellular objects from the dark image background. Following object
identification, quantitative measurements of each cell can be used to
extract numerous morphometric parameters, a feature available in most
image-analysis software.

Figure 7.7 SGA-HCS pipeline for evaluation of cell biological phenotypes. (A) Using
the SGA methodology, a fluorescent marker can be introduced into the arrayed
deletion collection. (B) Deletion mutant colonies expressing a fluorescent marker are
transferred to optical plates containing liquid selection media for imaging. (C) A robotic
arm is used to move optical plates between incubators and the HCS imaging system.
(D) Automated image analysis software, such as MetaXpress, is used to detect
fluorescent signal and measure morphological features.
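
The thresholding-based segmentation outlined in step 9 can be illustrated with a toy example. The Python sketch below is not the MetaXpress or CellProfiler implementation; it is a minimal stand-in that assumes a uniform background estimated as the median pixel value (shade correction is omitted), a user-chosen threshold, and 4-connected objects. The function name and the small test image are hypothetical.

```python
from collections import deque
from statistics import median


def segment(image, threshold):
    """Label bright objects and report simple per-object measurements.

    image: list of rows of pixel intensities.
    ASSUMPTIONS: background is uniform and estimated by the median pixel
    value; pixels brighter than `threshold` after subtraction are foreground;
    objects are 4-connected groups of foreground pixels.
    """
    background = median(v for row in image for v in row)
    corrected = [[max(v - background, 0) for v in row] for row in image]
    n_rows, n_cols = len(image), len(image[0])
    labels = [[0] * n_cols for _ in range(n_rows)]
    objects = []
    for r in range(n_rows):
        for c in range(n_cols):
            if corrected[r][c] <= threshold or labels[r][c]:
                continue
            label = len(objects) + 1
            labels[r][c] = label
            queue = deque([(r, c)])
            pixels = []
            while queue:  # breadth-first flood fill of one object
                y, x = queue.popleft()
                pixels.append(corrected[y][x])
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < n_rows and 0 <= nx < n_cols
                            and not labels[ny][nx]
                            and corrected[ny][nx] > threshold):
                        labels[ny][nx] = label
                        queue.append((ny, nx))
            objects.append({"label": label,
                            "area": len(pixels),
                            "mean_intensity": sum(pixels) / len(pixels)})
    return labels, objects


# Hypothetical 5 x 6 image containing two bright objects on a dark background
image = [
    [10, 10, 10, 10, 10, 10],
    [10, 90, 95, 10, 10, 10],
    [10, 92, 10, 10, 80, 10],
    [10, 10, 10, 10, 85, 10],
    [10, 10, 10, 10, 10, 10],
]
labels, objects = segment(image, threshold=40)
for obj in objects:
    print(obj)
```

Area and mean intensity are only two of the many morphometric parameters that dedicated packages extract; real images would also require illumination correction and splitting of touching cells.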

Statistical analysis and data mining for multiplexed high content analysis
(HCA) are still in their infancy. Hence, most HCA pipelines are custom designed
and also require manual inspection of images (Bakal et al., 2007; Corcoran et al., 2004;
Eggert et al., 2004; Gururaja et al., 2006; Loo et al., 2007; Narayanaswamy
et al., 2006; Tanaka et al., 2005). However, a more straightforward approach
is to compare the readout of each mutant population against that of the wild-type
cell population, and identify statistically deviant mutants for any desired
feature. Methods to standardize HCA are still in development, and the continued
refinement of these procedures to process HCS data without sacrificing
information accuracy will be of paramount significance.
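
The comparison of each mutant population against the wild-type population can be sketched as a simple z-score calculation. This is a deliberately crude illustration under stated assumptions: one feature at a time, per-cell measurements pooled per strain, an arbitrary cutoff of 3 standard deviations, and invented strain names and values. It is not the statistical procedure used in the cited studies.

```python
from statistics import mean, stdev


def deviant_mutants(wild_type, mutants, z_cutoff=3.0):
    """Return mutants whose mean feature value deviates from wild type.

    wild_type: per-cell measurements of one feature in wild-type cells.
    mutants: dict mapping strain name to per-cell measurements.
    The z-score expresses the mutant mean in units of the wild-type
    standard deviation; the cutoff is an arbitrary illustrative choice.
    """
    mu = mean(wild_type)
    sigma = stdev(wild_type)
    hits = {}
    for name, values in mutants.items():
        z = (mean(values) - mu) / sigma
        if abs(z) >= z_cutoff:
            hits[name] = z
    return hits


# Hypothetical measurements of a single morphometric feature
wild_type = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95]
mutants = {
    "mutA": [1.0, 1.02, 0.98],  # indistinguishable from wild type
    "mutB": [1.5, 1.6, 1.4],    # clearly shifted
}
hits = deviant_mutants(wild_type, mutants)
print(hits)
```

In a genome-wide screen, a correction for multiple testing and a more robust estimate of wild-type variability would be needed before calling hits.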

4.2. Essential gene and higher order genetic interactions 

This chapter focuses on the application of SGA analysis to generate double 
mutant strains and identify genetic interactions among nonessential deletion 
mutants. However, SGA can also be applied to examine synthetic genetic 
interactions involving essential genes. For example, an SGA query strain can
be crossed to the Tet-promoter collection (yTHC, Open Biosystems), and
double mutants can be selected and scored for growth defects in the
presence of doxycycline, which downregulates the expression of the
essential genes (Davierwala et al., 2005). Other essential gene mutant collections
(Ben-Aroya et al., 2008; Schuldiner et al., 2005) are now available in arrayed
formats and are amenable to genetic interaction analyses using SGA
technology (Costanzo et al., in press).

While digenic interactions are more commonly studied, SGA methodology
can be easily applied to examine higher order genetic interactions
involving more than two genes (Tong et al., 2004).

4.3. Combining SGA and gene overexpression libraries 

In addition to loss-of-function mutations, SGA can be easily adapted to
examine different forms of genetic interactions involving high-copy plasmid
or regulated expression of yeast or heterologous genes. Several gene
overexpression libraries have been constructed in recent years (Gelperin
et al., 2005; Hu et al., 2007; Jones et al., 2008; Zhu et al., 2001). One of these
libraries was used to assemble a Yeast Overexpression Array, containing
~6000 ORFs (Yeast GST-Tagged Collection, Open Biosystems), and was
combined with SGA to screen for synthetic dosage lethality and suppression
(Sopko et al., 2006). This proved to be a successful strategy for identifying
downstream targets regulated by specific signaling pathways (Sopko et al.,
2006, 2007). The development of new plasmid libraries carrying genes
under inducible expression will expand the potential for dosage lethality
screens (Gelperin et al., 2005; Hu et al., 2007), whereas the development of
libraries in which each gene is under the control of its own promoter
(Ho et al., 2009; Jones et al., 2008) offers the potential for building other
overexpression arrays that may be particularly useful for dosage suppression
experiments.

4.4. Applying SGA as a method for high-resolution 
genetic mapping (SGAM) 

Because double mutants are created by meiotic recombination, gene
deletions that are linked to the query gene, which we refer to as the "linkage
group," form double mutants at a reduced frequency and thus appear
synthetic lethal/sick with the query mutation. Since the gene deletions
represent mapping markers covering all chromosomes in the yeast genome,
SGA mapping (SGAM) has been shown to be an effective method for
high-resolution genetic mapping (Chang et al., 2005; Jorgensen et al., 2002).
In addition to mapping of recessive alleles, SGAM is particularly useful for
rapid mapping of dominant mutations, which are challenging to clone using
standard techniques (Menne et al., 2007).
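
The idea behind SGAM, that a run of neighboring deletions all yielding small double-mutant colonies marks the position of the query mutation, can be sketched as a sliding-window scan along each chromosome. The code below is an illustrative assumption rather than the published analysis: the window size, the data layout, and the example scores are invented, and colony sizes are expressed relative to a control.

```python
def map_linkage(scores, window=3):
    """Locate the chromosomal window with the smallest mean colony size.

    scores: iterable of (chromosome, coordinate_kb, relative_colony_size)
    for each array deletion. Returns (chromosome, central coordinate, mean
    size) for the most depleted window. A run of adjacent low scores is
    taken to indicate linkage to the query mutation, whereas an isolated
    low score is more likely a genuine genetic interaction.
    """
    by_chrom = {}
    for chrom, coord, size in scores:
        by_chrom.setdefault(chrom, []).append((coord, size))
    best = None
    for chrom, entries in by_chrom.items():
        entries.sort()  # order deletions by chromosomal coordinate
        for i in range(len(entries) - window + 1):
            chunk = entries[i:i + window]
            avg = sum(size for _, size in chunk) / window
            centre = chunk[window // 2][0]
            if best is None or avg < best[2]:
                best = (chrom, centre, avg)
    return best


# Hypothetical data: a linkage group on chromosome IV and one isolated
# sick double mutant on chromosome VII
scores = [
    ("IV", 100, 1.0), ("IV", 200, 0.95), ("IV", 300, 0.6), ("IV", 400, 0.2),
    ("IV", 500, 0.1), ("IV", 600, 0.3), ("IV", 700, 0.9),
    ("VII", 100, 1.0), ("VII", 200, 1.05), ("VII", 300, 0.98),
    ("VII", 400, 0.2), ("VII", 500, 1.0),
]
best = map_linkage(scores)
print(best)
```

In this toy data set the scan points to the interval around 500 kb on chromosome IV, while the single low score on chromosome VII does not dominate any window.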

4.5. Chemical genomics 

Because chemical perturbations mimic genetic perturbations, genetic
networks also provide a key for predicting the targets of inhibitory bioactive
molecules (Parsons et al., 2004) (Fig. 7.8). If a compound inhibits a specific
target protein, then the chemical-genetic profile, the set of mutants that are
hypersensitive to the compound, should overlap with the genetic interaction
profile of the target gene. As a result, the compound and its target
should cluster together and with other genes and compounds involved in
the same biological process (Fig. 7.8).
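
The profile comparison described here can be sketched by correlating a chemical-genetic profile with each genetic interaction profile. The example below borrows the hypothetical setting of Fig. 7.8 (compound X hypersensitivity in deletion mutants 3, 5, 6, and 7; query gene A interacting with deletions 1, 2, 3, and 4). The profile assigned to gene B is an assumption made for illustration, as are the binary encoding, the total of eight mutants, and the use of Pearson correlation as the similarity measure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant profile carries no information
    return cov / (sx * sy)


def profile(hits, n_mutants=8):
    """Binary profile over deletion mutants 1..n_mutants (1 = interaction)."""
    return [1 if i in hits else 0 for i in range(1, n_mutants + 1)]


compound_x = profile({3, 5, 6, 7})
genetic_profiles = {
    "geneA": profile({1, 2, 3, 4}),
    "geneB": profile({3, 5, 6}),  # ASSUMED profile resembling compound X
}

ranked = sorted(
    ((gene, pearson(compound_x, prof)) for gene, prof in genetic_profiles.items()),
    key=lambda item: item[1],
    reverse=True,
)
for gene, r in ranked:
    print(gene, round(r, 3))
```

The top-ranked gene is a candidate target, or a component of the target pathway, to be confirmed experimentally; with quantitative interaction scores the same comparison applies without the binary encoding.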

The integration of ~1700 genetic interaction profiles with ~400 drug
sensitivity profiles (Costanzo et al., in press) showed that compounds with
known functions, such as hydroxyurea, which inhibits DNA synthesis, or
tunicamycin, which inhibits glycosylation, cluster with their expected biological
process (Giaever et al., 1999). Furthermore, the target of a novel drug, now
named Erodoxin, was identified via inspection of a combined chemical-
genetic correlation network. Thus, this chemical-genetic approach to
mode-of-action analysis complements haploinsufficiency profiling, which
focuses on identifying the drug target directly (Giaever et al., 1999;
Hillenmeyer et al., 2008).

Figure 7.8 Chemical-genetic interactions can be modeled by synthetic genetic
interactions. (A) In a chemical-genetic interaction (at left), a deletion mutant, lacking the
product of the deleted gene (represented by a black X), is hypersensitive to a normally
sublethal concentration of a growth-inhibitory compound. In a synthetic lethal genetic
interaction (right), two single deletions lead to viable mutants but are inviable in a
double-mutant combination. Gene deletion alleles that show chemical-genetic
interactions with a particular compound should also be synthetically lethal or sick with a
mutation in the compound target gene. (B) Comparison of a chemical-genetic profile to
a compendium of genetic interaction (synthetic lethal) profiles should identify the
pathways and targets inhibited by drug treatment. In this hypothetical figure,
chemical-genetic and genetic interactions are both designated by red squares. For example,
deletion mutants 3, 5, 6, and 7 are hypersensitive to compound X and a mutation in
query gene A leads to a fitness defect when combined with deletion alleles 1, 2, 3, and 4.
Here, the chemical-genetic profile of compound X resembles the genetic profile of
gene B, thereby identifying the product of gene B as a putative target of compound X.
Abstract

The study of temperature-sensitive (Ts) mutant phenotypes is fundamental to
gene identification and for dissecting essential gene function. In this chapter,
we describe two "shuffling" methods for producing Ts mutants using a
combination of PCR, in vivo recombination, and transformation of diploid strains
heterozygous for a knockout of the gene of interest. The main difference
between the two methods is the type of strain produced. In the "plasmid"
version, the product is a knockout mutant carrying a centromeric plasmid
that bears the Ts allele. In the "chromosomal" version, the Ts alleles are
integrated directly into the endogenous locus, albeit not in an entirely native
configuration. Both variations have their strengths and weaknesses, which
are discussed here.

1. Introduction

The study of temperature-sensitive (Ts) mutant phenotypes has 
proven to be a fundamental approach both for the identification of gene 
sets essential for various aspects of biology and for obtaining a detailed 
understanding of essential gene function. Although temperature-sensitive
mutations were recognized as a general class of mutation in the 1950s
(Horowitz, 1950), the first targeted screen for Ts mutants, together with their
isolation and analysis (382 mutations located in 37 genes scattered widely
over the bacteriophage T4 genome), was carried out by Edgar and Lielausis
in 1963 (Edgar and Lielausis, 1964). Hartwell (1967) reported
the isolation of 400 Ts mutations in Saccharomyces cerevisiae, which caused 
defects in essential processes including cell division, and protein, RNA, 
and DNA synthesis. Over the past 40 years, the isolation and analysis of Ts 
mutations in essential genes has been a linchpin technology for investigating 
the genetics and molecular biology of essential processes in all experimental 
organisms. 

Ts mutations are typically missense mutations, which retain the function 
of a specific essential gene at standard (permissive) low temperature, lack 
that function at a defined high (nonpermissive) temperature, and exhibit 
partial (hypomorphic) function at an intermediate (semipermissive) temper- 
ature. Such mutants make possible the analysis of physiologic changes 
that follow controlled inactivation of a gene or gene product by shifting 
cells to a nonpermissive temperature, offering a powerful approach to the 
analysis of gene function. 

Essential genes, by definition, encode critical cellular functions that are
not buffered by redundant functions or pathways (Hartman et al., 2001).
Essential genes have been shown to be highly dense hubs within genetic
interaction networks and are involved in all aspects of basic cellular function
(Jeong et al., 2001). Furthermore, essential genes tend to be more highly
conserved in evolution; 38% of essential yeast proteins have easily
identifiable counterparts in humans, versus 20% for nonessential genes (Hughes,
2002).

Despite their importance, the functions of many essential yeast proteins
have not been studied. In part, this is due to the absence of essential gene
representation in the genome-wide haploid mutant collections, which
cover all of the ~5000 nonessential yeast genes. Thus, no comparable
systematic haploid mutant collection currently exists for the ~1000 essential
genes in S. cerevisiae. The frequency of sites mutable to a reduced or
conditional function is highly gene-specific; for example, for highly
conserved proteins, random single missense mutation would be expected,
for the vast majority of positions within the protein, to cause complete loss
of function when mutated. Therefore, genetic screens using a random
mutagenesis approach rarely reach saturation because "mutability" varies
widely among genes.

Here, we report detailed protocols for two methodologies that allow the 
systematic isolation of Ts alleles in essential genes of interest. The first 
method is plasmid-based, and the second is genome integration-based, 
and each has its specific advantages depending on the application. Both 
methods exploit features of the "haploid-convertible" heterozygous diploid
collection, which allows introduction of the library of mutagenized essential
gene copies into the heterozygous diploid and subsequent direct selection of
haploids which are deleted for the target essential gene and that carry
individual members of the mutagenized essential gene library, using the
"diploid shuffle" technique (see below). Several other useful corollary
methods for transferring extant Ts alleles or specific gene constructs (e.g.,
fusion proteins) are also presented. Finally, we note that our laboratories
are in the process of generating a complete set of Ts alleles for each of the
essential genes in S. cerevisiae, which will be distributed as a resource to the
scientific community when completed (see Ben-Aroya et al., 2008 for
details). For specific essential genes under study in individual laboratories,
however, it may be useful, using the methods described here, to generate an 
additional series of independent Ts alleles for detailed functional analysis. 
Furthermore, mutagenized libraries of specific essential or nonessential 
genes can be screened for conditional viability under a variety of conditions 
that are normally sublethal in the wild-type strain (e.g., sublethal doses of 
drugs) using the methods described here. 




2. Diploid Shuffle — Plasmid Method 

2.1. General description of the diploid 
shuffle — plasmid method 

Much like traditional plasmid-shuffling methods, mutants generated by this 
version of the diploid shuffle are plasmid-borne alleles that can be very easily 
transferred to and tested in different strain backgrounds. The experimental 
procedure is outlined in Fig. 8.1. First, the endogenous promoter (including 
the 5'-untranslated region, 5'-UTR, ~500 bp) and the terminator 
(3'-UTR, ~500 bp) of a gene of interest (or your favorite gene, YFG) 
are PCR-amplified and cloned in tandem onto a centromere-based 
yeast-Escherichia coli shuttle vector that carries URA3 as the selectable 
marker in yeast cells, such as pRS416. The resultant promoter/terminator 
clone is subsequently linearized with an endonuclease, typically NotI, which 
cuts at a site preengineered between the promoter and terminator. 
Simultaneously, the sequence of YFG, including the whole open reading frame 
(ORF), complete with the promoter and terminator regions, is randomly 
mutagenized with error-prone PCR. The linearized promoter/terminator plasmid 
and PCR products are then combined and cotransformed into a 
haploid-convertible heterozygous diploid deletion mutant (YFG/yfgΔ::kanMX4) 
of the same gene being mutagenized. The mutagenized PCR product is 
thereby cloned into the URA3 plasmid via recombination mediated by the 
terminal homologous DNA sequences of both the PCR products and the 
linearized vector. The Ura+ transformants are subsequently cultured in a 
sporulation medium and converted into haploid cells by growing on a 
medium that allows growth of only haploid MATa G418R Ura+ cells. 
In these cells, the chromosomal wild-type copy of YFG is deleted, allowing 
direct observation of any phenotypes of the plasmid-borne alleles. To screen 
for conditional alleles, such haploid cells are first grown under a permissive 
condition such as low temperature. Colonies formed are subsequently 
replica-plated to fresh plates at permissive and nonpermissive conditions. 
Conditional alleles are identified as those that grow under the permissive 
but not the nonpermissive condition and are subsequently verified. This 
method has been used to create thermosensitive (Ts) alleles of multiple 
essential genes as well as a large collection of methyl methanesulfonate 
(MMS) hypersensitive alleles of POL30 (Huang et al., 2008; Lin et al., 2008). 
Here, we outline the detailed methods for generating and verifying Ts alleles 
of an essential gene. 

Figure 8.1 A schematic for creating conditional alleles using plasmid-chromosome 
shuffle. The promoter (5') and terminator (3') of YFG are separately PCR-amplified 
with primer pairs PF/PR and TF/TR, respectively, and cloned together onto a 
centromeric (CEN) yeast-E. coli shuttle vector. Here, PF stands for promoter 
forward, PR for promoter reverse, TF for terminator forward, and TR for terminator 
reverse. A NotI recognition site is engineered between the promoter and terminator. 
The resultant promoter/terminator clone is linearized by NotI digestion. Meanwhile, 
the entire sequence of the YFG gene, including the coding region and the promoter 
and terminator sequences, is mutagenized using error-prone PCR with the primer 
pair PF/TR. The mutagenesis PCR products and the linearized promoter/terminator 
plasmid DNA are mixed and transformed together into a haploid-convertible 
heterozygous diploid knockout mutant of the same gene (MATa/α YFG/yfgΔ::kanMX4 
CAN1/can1Δ::LEU2-MFA1pr-HIS3). The linearized promoter/terminator clones are 
repaired inside yeast cells, mostly via homologous recombination, using the 
cotransformed mutagenesis products (or YFG* alleles) as the templates. Owing to 
the extensive homology between the ends of the PCR product and the vector, a 
large number of recombinant clones can be easily generated. This pool of 
recombinants is then sporulated. Haploid MATa G418R Ura+ cells are selected under 
a permissive condition on solid SC-Ura-Leu-His-Arg+G418+Can medium as single 
colonies, which are replica-plated onto two fresh plates and incubated under both 
permissive and nonpermissive conditions. Candidate alleles will grow under the 
permissive condition but not under the nonpermissive condition. 
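
The haploid selection logic described above can be made concrete with a small sketch. It enumerates the spore genotypes expected from the heterozygous diploid and predicts which ones should grow on SC-Ura-Leu-His-Arg+G418+Can. The allele labels, function names, and the assumption that the three loci segregate independently are illustrative choices, not part of the published protocol.

```python
from itertools import product

# Loci that segregate in the haploid-convertible heterozygous diploid.
# Allele labels are illustrative strings (assumption), not a nomenclature parser.
LOCI = {
    "MAT": ("a", "alpha"),
    "YFG": ("YFG", "yfg-delta::kanMX4"),
    "CAN1": ("CAN1", "can1-delta::LEU2-MFA1pr-HIS3"),
}


def spore_genotypes():
    """Yield every allele combination a haploid spore can inherit."""
    names = list(LOCI)
    for alleles in product(*(LOCI[name] for name in names)):
        yield dict(zip(names, alleles))


def grows_on_haploid_selection(spore, has_ura3_plasmid=True):
    """Predict growth of one spore on SC-Ura-Leu-His-Arg+G418+Can."""
    # The can1 deletion confers canavanine resistance and carries LEU2.
    has_reporter = spore["CAN1"] != "CAN1"
    # The MFA1 promoter drives HIS3 only in MATa cells.
    his_plus = has_reporter and spore["MAT"] == "a"
    # kanMX4 marks the chromosomal deletion of YFG (G418 resistance).
    g418_resistant = spore["YFG"] != "YFG"
    return has_ura3_plasmid and has_reporter and his_plus and g418_resistant


all_spores = list(spore_genotypes())
survivors = [s for s in all_spores if grows_on_haploid_selection(s)]
print(len(all_spores), "genotypes;", len(survivors), "predicted to grow")
```

Under these assumptions only one of the eight spore classes survives, and only if it also carries the URA3 plasmid, which is why any growth phenotype on this medium can be attributed to the plasmid-borne allele.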



2.2. Materials 
2.2.1. Media 

Haploid selection synthetic medium SC-Ura-Leu-His-Arg+G418+Can: dextrose, 
20 g/l; yeast nitrogen base without amino acids and ammonium sulfate, 1.7 g/l; 
SC-Ura-Leu-His-Arg dropout mix, 2 g/l; sodium glutamate, 1 g/l; G418, 200 
mg/l; L-canavanine (Sigma, Cat# C1625), 60 mg/l; agar, 2%. The sodium 
glutamate is substituted for ammonium sulfate as the nitrogen source and makes 
the G418 selection more reliable on the minimal medium. 

SC-Ura: dextrose, 20 g/l; yeast nitrogen base without amino acids and 
ammonium sulfate, 1.7 g/l; SC-Ura dropout mix, 2 g/l; ammonium sulfate, 
5 g/l; agar, 2%. 

Liquid YPD: yeast extract, 10 g/l; peptone, 20 g/l; dextrose, 20 g/l. 

Solid and liquid sporulation medium: potassium acetate, 10 g/l; zinc acetate, 
0.05 g/l, with or without 2% agar, respectively. 

Solid Luria Broth (LB) plus carbenicillin: yeast extract, 10 g/l; tryptone, 
5 g/l; sodium chloride, 10 g/l; carbenicillin, 50 mg/l; agar, 2%. 

Liquid LB plus ampicillin: yeast extract, 10 g/l; tryptone, 5 g/l; sodium 
chloride, 10 g/l; ampicillin, 50 mg/l. 
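
Because these recipes are given per litre, batch sizes have to be scaled by hand. The sketch below shows one way to do that for three of the media listed above. The dictionary layout and the function name scale_recipe are assumptions made for illustration; the amounts are transcribed from the recipes, with 2% agar taken as 20 g/l.

```python
# Recipes in grams per litre, transcribed from the media list above.
# 2% agar is taken as 20 g/l (w/v); mg/l values are converted to g/l.
RECIPES_G_PER_L = {
    "SC-Ura": {
        "dextrose": 20.0,
        "yeast nitrogen base (no amino acids, no ammonium sulfate)": 1.7,
        "SC-Ura dropout mix": 2.0,
        "ammonium sulfate": 5.0,
        "agar": 20.0,
    },
    "SC-Ura-Leu-His-Arg+G418+Can": {
        "dextrose": 20.0,
        "yeast nitrogen base (no amino acids, no ammonium sulfate)": 1.7,
        "SC-Ura-Leu-His-Arg dropout mix": 2.0,
        "sodium glutamate": 1.0,
        "G418": 0.2,
        "L-canavanine": 0.06,
        "agar": 20.0,
    },
    "Liquid YPD": {
        "yeast extract": 10.0,
        "peptone": 20.0,
        "dextrose": 20.0,
    },
}


def scale_recipe(name, volume_ml):
    """Return grams of each component needed for volume_ml of medium."""
    factor = volume_ml / 1000.0
    return {
        component: round(grams * factor, 4)
        for component, grams in RECIPES_G_PER_L[name].items()
    }


# Example: a 500 ml batch of haploid selection medium.
for component, grams in scale_recipe("SC-Ura-Leu-His-Arg+G418+Can", 500).items():
    print(f"{component}: {grams} g")
```
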
2.2.2. Yeast strains 

The haploid-convertible heterozygous diploid knockout mutants (MATa/α 
ura3Δ0/ura3Δ0 leu2Δ0/leu2Δ0 his3Δ1/his3Δ1 lys2Δ0/LYS2 met15Δ0/MET15 
can1Δ::LEU2-MFA1pr::HIS3/CAN1 YFG/yfgΔ::KanMX; Open Biosystems 
Cat# YSC4428) (Pan et al., 2006) are used to screen for Ts alleles. 
Chemically competent DH5α cells prepared as described (Inoue et al., 1990) 
are used for cloning and for plasmid recovery from yeast. 

2.2.3. Plasmids 

It is essential for this method to use a plasmid vector that contains the YFG 
promoter and terminator separated by a unique endonuclease (typically 
NotI) recognition site. Due to the limited auxotrophic markers available 
in the haploid-convertible heterozygous diploid knockout mutants, we 
normally use plasmids containing URA3 as the selectable marker, such as 
pRS416 (Brachmann et al., 1998; Sikorski and Hieter, 1989) and YCplac33 
(Gietz and Sugino, 1988). Other URA3 CEN vectors should also work. 

2.2.4. Yeast genomic DNA 

A genomic DNA sample isolated from the wild-type yeast strain BY4743 
(MATa/α) (Brachmann et al., 1998) was used as the template for cloning the 
promoter and terminator of YFG and for mutagenizing its entire sequence 
with PCR. 

2.3. Methods 

2.3.1. Constructing the promoter/terminator clone 

In the past, we mostly used restriction endonuclease digestion and 
ligation to construct the promoter/terminator clone. First, the promoter and 
terminator of YFG are separately PCR-amplified using primers that contain 
endonuclease recognition sites, for example, HindIII/NotI for the promoter 
and NotI/BamHI for the terminator. The PCR products are then digested 
with HindIII/NotI and NotI/BamHI, respectively, and ligated to pRS416 
(or YCplac33) digested with HindIII/BamHI in a three-piece ligation reaction. 
The ligation products are transformed into DH5α competent cells and 
candidate clones are selected on solid LB plus carbenicillin. More recently, 
we have adopted a modified version of the sequence and ligation independent 
cloning (SLIC) procedure (Li and Elledge, 2007) (Fig. 8.2). This method does 
not require endonuclease digestion of the inserts and thus greatly simplifies 
primer design and experimental procedures, especially when a large number of 
genes are processed simultaneously. Because this is a relatively new cloning 
technique, it is described below in greater detail. 

Figure 8.2 Constructing a promoter/terminator clone using sequence and ligation 
independent cloning (SLIC). The promoter (5') and terminator (3') of YFG are 
separately PCR-amplified from yeast genomic DNA with primer pairs PF/PR and TF/TR, 
respectively. Meanwhile, a centromeric yeast-E. coli shuttle vector is linearized 
by endonuclease digestion at the multicloning site (MCS). The PCR products and the 
linear vector plasmid are mixed together and processed with T4 DNA polymerase to 
create 5' single-stranded overhangs. The PCR primers are designed in such a way 
that the PCR products and the vector can be assembled via a homology-mediated 
single-strand annealing process. A NotI site is engineered between the cloned 
promoter and terminator. An aliquot of the annealing reaction is transformed into 
E. coli competent cells. This is a modified version of the SLIC procedure 
originally described by Li and Elledge (2007). 

1. Design four PCR primers: promoter forward (PF), promoter reverse 
    (PR), terminator forward (TF), and terminator reverse (TR). In 
    addition to gene-specific sequences at their 3' termini, the PF and TR 
    primers each contain a 30 bp sequence at the 5' end that is either 
    identical or complementary to the ends of pRS416 (or YCplac33) 
    linearized at the multicloning site (Fig. 8.2). The PR and TF primers 
    are completely complementary to each other and both contain three 
    parts: ~20 bp promoter- or terminator-specific sequences on both ends 
    and a NotI recognition site (8 bp) in the middle (Fig. 8.2). 

2a. PCR-amplify the promoter and terminator according to the conditions 
    listed in Table 8.1. Here genomic DNA from BY4743 (MATa/α) is used 
    as the PCR template. Platinum Pfx DNA polymerase (Invitrogen, 
    Cat# 11708) is used due to its relative robustness and high fidelity. 
    Other enzymes with similar features, such as the Phusion (New 
    England Biolabs, NEB, Cat# F-540) and KOD (Novagen, Cat# 
    71085-3) DNA polymerases, can also be used. 

2b. Digest ~1 µg of pRS416, YCplac33, or any other URA3 plasmid 
    with an endonuclease at the multicloning site to generate ends that are 
    competent for cloning the PCR products via SLIC. 

3. Gel-purify the PCR products and the digested vector DNA using a gel 
    extraction kit (Qiagen, Cat# 28706 or equivalent) by following 
    the manufacturer's instructions. Here the PCR products of both the 
    promoter and terminator are combined during purification. Elute each 
    purified sample in 25 µl of the provided elution buffer. 

4. Set up a reaction according to Table 8.2. Here DNA resection in the 
    3'-5' direction (by T4 DNA polymerase, NEB, Cat# M0203) and 
    annealing between complementary single-strand overhangs occur in 
    the same reaction. 

5. Immediately use 2 µl of the above reaction to transform 20 µl of 
    chemically competent DH5α cells by following a standard protocol. 
    This includes incubating the cell/DNA mixture sequentially on ice 
    for 30 min, at 42 °C for 90 s, back on ice for 2 min, and at 37 °C for 
    30 min (in the presence of 200 µl of liquid LB medium). 

Table 8.1 PCR amplification of promoters and terminators 

Component                                   Volume/reaction (µl) 
10x Platinum Pfx DNA Polymerase buffer      2.5 
dNTP (2.5 mM each)                          2 
Primer mix (5 µM each)                      2 
Yeast genomic DNA (~200 ng/µl)              1 
ddH2O                                       17.25 
Platinum Pfx DNA Polymerase (2.5 U/µl)      0.25 
Total                                       25 

PCR conditions: 94 °C for 4 min; 30 x (94 °C for 30 s, 55 °C for 30 s, 72 °C for 45 s); 
72 °C for 7 min; hold at 4 °C. 

Table 8.2 A T4 DNA polymerase resection reaction 

Component                                   Volume (µl) 
10x NEB BamHI Buffer                        2 
10x BSA                                     2 
ddH2O                                       3.7 
T4 DNA polymerase (3 U/µl)                  0.3 
Vector (100 ng/µl)                          2 
PCR product (50 ng/µl)                      10 
Total                                       20 

Once set up, the reaction is incubated at 25 °C for 30 min and then placed on ice 
until E. coli transformation. 



6. Plate 100 µl of the transformation mixture on solid LB plus 
    carbenicillin to select for single-colony transformants. 

7. Screen for positive clones using colony PCR with the PF and PR 
    primers according to Table 8.1 but with one modification: cells from 
    single colonies, instead of yeast genomic DNA, are used to provide 
    the PCR templates. 

8. Prepare plasmid DNA samples from two to three positive clones using 
a mini-spin kit (Qiagen, Cat# 27106 or equivalent) by following the 
manufacturer's instructions. 

9. Verify each plasmid with DNA sequencing by using two primers that 
read toward the inserts (promoter and terminator) in both directions 
from the vector backbone. 
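
The primer layout from step 1 can be sketched in code as follows. This is a minimal illustration that assumes all input sequences are given 5' to 3' on the top strand and that the vector is described by the 30 bp on either side of the cut site; the function names and the toy sequences are hypothetical, and real primers would still need to be checked for melting temperature and secondary structure.

```python
NOTI_SITE = "GCGGCCGC"  # 8 bp NotI recognition sequence
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_slic_primers(promoter, terminator, vector_left, vector_right,
                        gene_len=20, vector_len=30):
    """Build PF, PR, TF, and TR following the layout described in step 1.

    promoter, terminator: top-strand sequences of the cloned regions.
    vector_left, vector_right: top-strand vector sequence immediately
    upstream and downstream of the cut site (hypothetical inputs).
    """
    # PF: vector homology at the 5' end, promoter-specific 3' end.
    pf = vector_left[-vector_len:] + promoter[:gene_len]
    # TF: end of promoter, NotI site, start of terminator.
    tf = promoter[-gene_len:] + NOTI_SITE + terminator[:gene_len]
    # PR is fully complementary to TF.
    pr = reverse_complement(tf)
    # TR: reverse primer covering the terminator end plus vector homology.
    tr = reverse_complement(terminator[-gene_len:] + vector_right[:vector_len])
    return {"PF": pf, "PR": pr, "TF": tf, "TR": tr}


# Toy sequences for illustration only.
toy_promoter = "ACGTTGCAAGCT" * 5
toy_terminator = "TTGACCGATACG" * 5
toy_left = "CAGT" * 10
toy_right = "TGAC" * 10
primers = design_slic_primers(toy_promoter, toy_terminator, toy_left, toy_right)
for name, seq in primers.items():
    print(name, len(seq), seq)
```

Because the NotI site is palindromic, it reads identically in TF and PR, so either primer introduces the same site between the promoter and terminator.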



2.3.2. Mutagenizing YFG with error-prone PCR 

Mutagenesis of YFG using error-prone PCR is performed essentially as 
described previously (Leung et al., 1989). Again, a genomic DNA sample of 
the wild-type yeast strain BY4743 (MATa/α) is used as the DNA template. The 
PF and TR primers described above are used to amplify the full-length gene 
and ~500 bp of flanking sequence on each side. TaKaRa Ex Taq (Cat# RR001A) 
or LA Taq (Cat# RR002B), which are ~4 times more accurate than normal 
Taq polymerase, are used here due to their robustness. The mutation rate 
is raised by adding Mn2+ to the PCR at a final concentration of 
10-150 µM; the concentrations are arbitrarily defined, with higher 
concentrations for smaller genes (gene sizes range from 0.5 to 5 kb). 

1. Set up four independent reactions for each gene according to Table 8.3 
   to reduce potential founder effects. 

2. After PCR, pool the samples. 

3. Examine the PCR products by agarose gel electrophoresis using a 
   1-2 µl sample to ensure successful PCR amplification. 
Table 8.3 Error-prone PCR conditions 

Component                                   Volume/reaction (µl) 
10x Ex Taq buffer                           5 
dNTP (2.5 mM each)                          4 
Primer mixture (5 µM each)                  4 
Yeast genomic DNA (~200 ng/µl)              2 
MnCl2 (1-15 mM)                             0.5 
ddH2O                                       34.25 
Ex Taq DNA polymerase (5 U/µl)              0.25 
Total                                       50 

PCR conditions: 94 °C for 4 min; 30 x (94 °C for 30 s, 55 °C for 30 s, 72 °C for 
1 min/kb); 72 °C for 7 min; hold at 4 °C. Lower MnCl2 concentrations are used for 
larger genes and higher concentrations for smaller genes. 
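
As a quick consistency check on Table 8.3, the sketch below computes the final MnCl2 concentration obtained from a given stock and confirms that the listed volumes add up to 50 µl. The function names are illustrative; the volumes and stock range are those given in the table.

```python
# Volumes per 50 µl reaction, transcribed from Table 8.3.
TABLE_8_3_UL = {
    "10x Ex Taq buffer": 5.0,
    "dNTP (2.5 mM each)": 4.0,
    "Primer mixture (5 uM each)": 4.0,
    "Yeast genomic DNA": 2.0,
    "MnCl2 stock": 0.5,
    "ddH2O": 34.25,
    "Ex Taq DNA polymerase": 0.25,
}


def final_concentration_uM(stock_mM, added_ul, total_ul):
    """Final concentration (µM) after diluting a stock into the reaction."""
    return stock_mM * 1000.0 * added_ul / total_ul


def stock_for_target_mM(target_uM, added_ul=0.5, total_ul=50.0):
    """Stock concentration (mM) needed to reach target_uM in the reaction."""
    return target_uM * total_ul / (added_ul * 1000.0)


total_volume = sum(TABLE_8_3_UL.values())
low = final_concentration_uM(1, TABLE_8_3_UL["MnCl2 stock"], total_volume)
high = final_concentration_uM(15, TABLE_8_3_UL["MnCl2 stock"], total_volume)
print(f"total {total_volume} ul; MnCl2 final range {low}-{high} uM")
```

Adding 0.5 µl of a 1-15 mM stock to a 50 µl reaction gives a 100-fold dilution, which reproduces the 10-150 µM final range quoted in the text.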

2.3.3. Linearizing plasmid DNA of the promoter/terminator clone 

Two micrograms of plasmid DNA of the promoter/terminator clone is 
digested with NotI (NEB, Cat# R0189) in NEB buffer 3 in a 20 µl reaction. 
The reaction is incubated at 37 °C overnight and subsequently at 
65 °C for 20 min to inactivate NotI. A small aliquot of the digestion product 
is examined by agarose gel electrophoresis to ensure complete digestion of 
the plasmid. 

2.3.4. Combining and concentrating the PCR products 
and digested vector 

The mutagenized PCR products (approximately 10-20 µg in 200 µl) and 
linearized plasmid DNA of the promoter/terminator clone (~2 µg) are next 
combined and concentrated by ethanol precipitation. 

1. Transfer both the mutagenesis PCR products and the NotI-digested 
   promoter/terminator plasmid DNA into a 1.7 ml microcentrifuge tube and 
   adjust the volume to ~200 µl by adding ddH2O. 

2. Add 5 µl of 4 M ammonium acetate (pH 7.0) and 500 µl of 100% ethanol 
   to the DNA sample and mix well by briefly vortexing. Place on ice for 
   10 min. 

3. Precipitate DNA by spinning at >12,000 rpm in a microcentrifuge for 
   7 min. There should be a tiny whitish DNA pellet at the bottom of 
   the tube. 

4. Carefully aspirate the liquid and wash the DNA pellet once with 300 µl 
   of 70% ethanol. 

5. Spin at >12,000 rpm in a microcentrifuge for 3 min and carefully 
   aspirate the ethanol. 

6. Dry the DNA pellet in a speed-vac. 

7. Resuspend the DNA in 28 µl of sterile ddH2O. 
2.3.5. Transforming yeast cells 

The concentrated DNA sample of PCR products and the linearized vector 
is next transformed into the corresponding haploid-convertible heterozy- 
gous diploid yeast knockout mutant to create a mutagenized library of YFG. 

1. Inoculate the haploid-convertible YFG/yfgΔ::kanMX4 heterozygous 
    diploid mutant into 5 ml of liquid YPD and incubate at 30 °C 
    overnight in a roller drum. 

2. Transfer an aliquot of the overnight culture into 50 ml of liquid YPD 
    (starting at 0.125 OD600/ml in a 250 ml Erlenmeyer flask) and 
    incubate at 30 °C with shaking (at 200 rpm) until a cell density of 
    ~0.5 OD600/ml is obtained. 

3. Harvest the culture in a 50 ml conical tube by spinning at 4000 rpm for 
    3 min in a benchtop centrifuge and discard the medium. 

4. Resuspend cells in 10 ml of sterile ddH2O, centrifuge as described in 
    the previous step, and discard the supernatant. 

5. Resuspend cells in 10 ml of 0.1 M lithium acetate (LiOAc), centrifuge, 
    and discard the supernatant as before. 

6. Resuspend cells in residual 0.1 M LiOAc in a total volume of 100 µl in 
    a 1.7 ml microcentrifuge tube. 

7. Make a transformation mixture in this order and mix well: 480 µl of 
    50% polyethylene glycol (PEG-3350, J.T. Baker, Cat# JTU221-9), 72 µl 
    of 1 M lithium acetate, 40 µl of heat-denatured herring sperm DNA 
    (10 mg/ml, Sigma, Cat# D6898), and 28 µl of the DNA sample to be 
    transformed (error-prone PCR products and linearized promoter/ 
    terminator clone). 

8. Add the transformation mixture to the yeast competent cells prepared 
    in step 6 and immediately mix well by pipetting with a P1000 pipettor 
    followed by vortexing (VWR, Vortexer 2) at top speed for 5-10 s. 

9. Incubate the transformation reaction in a 30 °C incubator for 30 min. 

10. Add 72 µl of dimethyl sulfoxide (DMSO; Qbiogene DMSO0001, 
    Molecular Biology Grade) to the transformation reaction and 
    immediately mix thoroughly by vortexing at top speed for 5-10 s. DMSO 
    is intrinsically sterile and no further sterilization is needed. 

11. Incubate the transformation reaction in a 42 °C water bath for 13 min. 

12. Spin at 3600 rpm in a microcentrifuge for 30 s to pellet the cells. 

13. Aspirate the supernatant and resuspend cells in 1 ml of sterile ddH2O. 

14. Take 0.2 µl (1/5000) of the resuspended cells and plate on an SC-Ura 
    plate to determine transformation yield. A successful reaction typically 
yields a library of 10 —10 independent Ura transformants, with 
>90% being the recombinants between the PCR products and 
the linearized plasmid DNA of the promoter/ terminator clone. The 
remainder is primarily empty vector molecules that were never cut in 
the first place or rejoined by nonhomologous end joining. 
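
Two small calculations recur in this procedure: how much overnight culture to use in step 2, and how to turn the colony count from step 14 into a library size. The sketch below shows both. The function names and the example numbers (an overnight culture at 8 OD600/ml, 40 colonies on the titer plate) are hypothetical and serve only to illustrate the arithmetic.

```python
def inoculum_volume_ml(overnight_od, target_od=0.125, culture_ml=50.0):
    """Volume of overnight culture needed to start culture_ml at target_od."""
    return target_od * culture_ml / overnight_od


def library_size(colonies_counted, plated_ul=0.2, total_ul=1000.0):
    """Estimate total transformants from colonies on the titer plate.

    The defaults correspond to plating 0.2 µl of a 1 ml cell suspension,
    that is, 1/5000 of the transformation.
    """
    return round(colonies_counted * total_ul / plated_ul)


def expected_recombinants(total_transformants, recombinant_fraction=0.9):
    """Lower-bound estimate using the >90% recombinant fraction from the text."""
    return round(total_transformants * recombinant_fraction)


# Hypothetical example values.
volume = inoculum_volume_ml(overnight_od=8.0)
size = library_size(colonies_counted=40)
print(f"inoculate {volume:.2f} ml; library of about {size} transformants, "
      f"at least {expected_recombinants(size)} recombinants")
```
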
2.3.6. Sporulation 

The rest of the yeast transformation reaction is either incubated in 50 ml 
of fresh liquid SC-Ura at 30 °C for 2 days to allow propagation of the 
library or sporulated immediately to convert the transformants into 
a library of haploid spores that harbor mutant alleles of YFG on a plasmid 
(see below). 

1. Grow the yeast transformants nonselectively in 50 ml of liquid YPD by 
incubating at 30 °C with shaking (180—200 rpm) for 3 h to refresh cells. 

2. Harvest cells in a 50 ml conical tube by spinning at 4000 rpm for 3 min in 
a bench top centrifuge and discard the medium. 

3. Resuspend cells in 40 ml of sterile ddH2O and spin at 4000 rpm for 
   3 min. 

4. Discard the supernatant and resuspend cells in 50 ml of liquid sporulation 
   medium. 

5. Incubate the sporulation culture in a 250 ml flask at 25 °C for 4-6 days 
   with shaking (180-200 rpm). This typically gives a sporulation 
   efficiency of 20-40% when checked under a microscope. 

6. Repeat Steps 2 and 3. 

7. Discard the supernatant and resuspend cells in 10 ml of sterile ddH2O. 

8. Spread aliquots (200 µl) of 10-fold serial dilutions of the sporulation 
   culture onto individual plates of the haploid selection medium 
   SC-Ura-Leu-His-Arg+G418+Can and incubate at 25 °C for 2-3 days to 
   determine the efficiency of producing MATa G418R Ura+ haploid cells. 

9. Store the rest of the spores in ddH2O at 4 °C for later use. The viability 
   of spores can be maintained for a few weeks in this way. 



2.3.7. Screening for Ts alleles 

After the titer of MATa G418R Ura+ haploid cells is determined, the library 
is screened for potential Ts mutants. We typically screen ~4000 clones for 
each gene. 

1. Spread the spores on solid SC-Ura-Leu-His-Arg+G418+Can, aiming 
   for a density of ~400 MATa G418R Ura+ haploid colonies on 
   each of 10 plates. Store the rest of the spores at 4 °C as a backup. 

2. Incubate the plates at 25 °C for 3 days to allow formation of colonies of 
   ~2 mm in diameter. 

3. Replica-plate the colonies from each plate to two fresh plates of the same 
   haploid selection medium and mark the orientations of the plates. It is 
   best to prewarm the "nonpermissive" plate to 37 °C prior to replica plating. 

4. Incubate one of the daughter plates at 25 °C and the other at 37 °C for 
   1 day. 

5. Compare growth of each colony on both plates to assess a potential Ts 
   phenotype and select alleles that form relatively robust colonies at 25 °C 
   but ghost colonies at 37 °C as candidate Ts mutants. If no Ts colony is 
   detected, one may need to screen more clones by using the backup spores 
   stored at 4 °C, by using 38 °C as the restrictive temperature, or both. 
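
The plating density in step 1 follows directly from the titer measured on the dilution plates in Section 2.3.6. A minimal sketch of that arithmetic is given below; the function names and the example count (80 colonies from 200 µl of a 100-fold dilution) are hypothetical.

```python
import math


def titer_per_ml(colonies, dilution_fold, plated_ml=0.2):
    """Selectable haploid cells per ml of undiluted spore suspension.

    dilution_fold is the total dilution of the plated sample,
    for example 100 for a 10^-2 dilution.
    """
    return colonies * dilution_fold / plated_ml


def volume_per_plate_ul(titer, colonies_wanted=400):
    """Volume of undiluted suspension (µl) giving colonies_wanted per plate."""
    return colonies_wanted / titer * 1000.0


def plates_needed(clones_to_screen=4000, colonies_per_plate=400):
    """Number of plates required to reach the screening target."""
    return math.ceil(clones_to_screen / colonies_per_plate)


# Hypothetical example: 80 colonies from 200 µl of a 100-fold dilution.
titer = titer_per_ml(colonies=80, dilution_fold=100)
print(f"titer {titer:.0f}/ml; plate {volume_per_plate_ul(titer):.1f} ul "
      f"on each of {plates_needed()} plates")
```

In practice the calculated volume would be diluted in sterile water to a volume that spreads evenly.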



2.3.8. Confirming Ts mutants 

Candidate mutants are picked and restreaked onto the same haploid selec- 
tion media and retested for the Ts phenotypes by incubating at 25 and 
37 °C. The plasmids are next recovered from those confirmed to be Ts in 
this initial assay and reintroduced individually into the same haploid- 
convertible heterozygous diploid mutant to test whether the Ts phenotype 
is linked to the plasmids. 

1. Grow each candidate Ts mutant in 1.5 ml of liquid SC-Ura at 25 °C 
    until saturated, which typically takes 1-2 days, depending on the 
    particular allele. 

2. Harvest cells of each strain in a microcentrifuge tube by spinning at 
    4000 rpm for 1 min and discard the medium. 

3. Resuspend cells in 500-1000 µl of sterile ddH2O and repeat Step 2. 

4. Resuspend cells in 40 µl of Lysis Buffer (50 mM Tris-HCl, pH 7.5, 10 mM 
    EDTA) containing 5 mg/ml of Zymolyase 100T (MP Biomedicals, 
    Cat# 320931) and incubate at 37 °C for 1 h with shaking at 15-min 
    intervals. 

5. Add 40 µl of 10% SDS and mix well by vortexing or pipetting. 

6. Add 160 µl of 7.5 M ammonium acetate and mix well by vortexing or 
    pipetting. 

7. Incubate the sample at -80 °C for 15 min. 

8. Centrifuge at >12,000 rpm for 5 min at 4 °C. 

9. Carefully transfer 100 µl of clear supernatant to a new tube that 
    contains 75 µl of isopropanol and mix well by inverting the tube several 
    times. 

10. Centrifuge at >12,000 rpm for 7 min at room temperature to 
    precipitate DNA. 

11. Wash the DNA pellet once with 100 µl of 70% ethanol. 

12. Carefully aspirate the ethanol and dry the DNA pellet in a speed-vac. 

13. Resuspend the DNA pellet in 20 µl of sterile ddH2O. 

14. Use 2 µl of the DNA sample to transform chemically competent DH5α 
    cells as described in Section 2.3.1. 

15. Purify plasmid DNA from a representative bacterial transformant using 
    a kit (Qiagen, Cat# 27106 or equivalent). 

16. Individually transform each plasmid into the haploid-convertible 
    heterozygous diploid knockout mutant as described previously (Gietz 
    et al., 1995) and select for Ura+ transformants. 

17. Patch two representative Ura+ transformants for each plasmid on solid 
    SC-Ura and incubate at 30 °C overnight. 

18. Transfer cells to sporulation plates and incubate at room temperature 
    for 4-6 days. 

19. Resuspend cells from each sporulated culture in sterile ddH2O. 

20. Spot aliquots of 10-fold serial dilutions onto two 
    SC-Ura-Leu-His-Arg+G418+Can plates. 

21. Incubate one plate at 25 °C and the other at 37 °C for 2-3 days. 

22. Compare the growth of MATa G418R Ura+ cells at both temperatures. 
    True Ts alleles will form colonies at 25 °C but not at 37 °C. 
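
The decision rule in step 22 can be written down explicitly, which is convenient when many plasmids are retested in parallel. The sketch below scores each retested transformant from two yes/no observations and calls a plasmid confirmed only when every patched transformant behaves as Ts. The function names, labels, and the all-must-agree rule are assumptions for illustration rather than part of the published protocol.

```python
def classify_growth(grows_at_25, grows_at_37):
    """Classify one strain from its growth at 25 and 37 degrees C."""
    if grows_at_25 and not grows_at_37:
        return "Ts"
    if grows_at_25 and grows_at_37:
        return "not Ts"
    return "no growth at permissive temperature"


def plasmid_confirmed(observations):
    """Return True if every retested transformant of a plasmid scores as Ts.

    observations: list of (grows_at_25, grows_at_37) tuples, one per
    transformant patched in step 17.
    """
    return bool(observations) and all(
        classify_growth(g25, g37) == "Ts" for g25, g37 in observations
    )


# Hypothetical retest results for three plasmids, two transformants each.
retest = {
    "plasmid-1": [(True, False), (True, False)],
    "plasmid-2": [(True, True), (True, False)],
    "plasmid-3": [(False, False), (False, False)],
}
for name, obs in retest.items():
    print(name, "confirmed" if plasmid_confirmed(obs) else "rejected")
```
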




3. Diploid Shuffle— Chromosome Method 

3.1. A general description 

The "diploid shuffle" chromosome method has now been used to systematically 
screen for missense mutations that result in temperature-sensitive 
(Ts) alleles of hundreds of essential genes, with each allele directly integrated 
at its endogenous chromosomal location and flanked by the "barcodes" of 
the corresponding yeast knockout mutant (Fig. 8.3; Ben-Aroya et al., 2008, 
unpublished data). First, YFG, including its promoter and terminator 
regions, is mutagenized with error-prone PCR (Fig. 8.3A). The mutagenized 
PCR product is next cloned into SB221+Topo-TA (Fig. 8.3B). This 
plasmid contains the URA3 gene flanked by the 5' and 3' regions of 
KanMX. The Topo-TA cloning site (Invitrogen) has been inserted 
between the KanMX 5' end and the URA3 gene. This site allows direct 
cloning of each of the PCR products without the need for any further 
modification. The result of the cloning step is a library of mutagenized 
YFG, which is transformed into E. coli and, following DNA purification, 
digested to release linear fragments (Fig. 8.3B). The linear fragments 
are directly transformed into the corresponding strain from the 
haploid-convertible heterozygous YFG/yfgΔ::kanMX diploid YKO collection by 
selecting for Ura+ transformants (Fig. 8.3C). The ~700 bp KanMX5' and 
KanMX3' fragments direct the mutagenized YFG library into the 
yfgΔ::KanMX genomic locus via homologous recombination, with retention of 
the original barcodes flanking each gene (Fig. 8.3C). Pools of Ura+ cells 
containing the mutant alleles are sporulated (Fig. 8.3D). Spores thus formed 
are spread on a haploid selective medium and incubated at 25 °C for colony 
formation. Only haploid MATa Ura+ spores containing the integrated 
mutant allele of YFG can grow on this medium. Colonies formed on 
selective medium are replica-plated and incubated at 37 °C (Fig. 8.3E). 
Colonies growing at 25 °C but not at 37 °C are selected as potential Ts 
alleles and retested. In summary, the final product of the diploid shuffle 
approach is a confirmed MATa strain in the YKO collection genetic 
background containing a URA3-marked Ts allele of a specific gene 
integrated into its endogenous locus and flanked by both barcodes. In addition, 
each strain contains a LEU2-MFA1pr-HIS3 reporter integrated at the 
CAN1 locus. 

Figure 8.3 A diagram of the "diploid shuffle" method for generating 
temperature-sensitive alleles. (A) Genomic DNA containing YFG and its 5' and 3' 
regions is used as the template for PCR mutagenesis. Two black horizontal arrows 
represent the gene-specific primers used. The mutagenized PCR products are cloned 
into the vector SB221+Topo-TA (mutations are represented by black stars). The 
Topo-TA cloning site is represented by a black T; the A overhang protruding from 
the PCR product is represented by a black A. The left gray bar represents the 5' 
half of the KanMX selectable marker (Kan); the right gray bar represents the other 
half of the KanMX selectable marker (MX). The NotI restriction sites are indicated 
by two diagonal black arrows. (B) The product of the cloning step is a library of 
mutagenized YFG. The library is then transformed into E. coli and, following DNA 
purification, digested with NotI to release linear fragments. (C) The linearized 
library is transformed into the corresponding heterozygous diploid strain. Bars 
that flank the KanMX knockout represent the two barcodes. (D) Heterozygous diploid 
transformants are sporulated (following meiosis), and MATa Ura+ haploid spores are 
selected on haploid selective medium at 25 °C. (E) Selection of 
temperature-sensitive candidates following replica plating and incubation at 25 
and 37 °C. Black arrows identify a potential Ts allele. 

In addition to creating Ts alleles, the diploid shuffle-chromosomal 
method can also be used to transfer existing alleles into the knockout strain 
background from other strain backgrounds and vice versa. Using the Topo-TA 
plasmid and protocol described in Fig. 8.3, any extant Ts allele can be 
easily transferred to the deletion collection genetic background (referred to 
as "allele transfer-in"). The result is an integrated allele, marked by URA3, 
and flanked by the appropriate barcodes. Moreover, using the Topo-TA 
plasmid, any PCR product (mutant allele, fusion protein, heterologous gene 
expression cassette, etc.) can be introduced at any of the 6000 genomic sites 
carrying a KanMX replacement cassette as the integration site, depending on 
the specific deletion mutant chosen as the recipient strain. By using primers 
that are external to the primers used for the original mutagenesis, the URA3 
marked Ts-allele or gene construct can be easily transferred from the 
deletion set genetic background to any other strain of interest, and replace 
the wild-type copy in the recipient strain by homologous recombination 
(referred to as "allele transfer-out," see Fig. 8.4). Thus, each Ts mutation or 
gene construct can be analyzed for more specific phenotypes of interest in 
a variety of genetic contexts. 



3.2. Materials 
3.2.1. Media 

Haploid selection medium SC-Ura-Leu-His-Arg+Can: dextrose, 20 g/l; yeast 
nitrogen base without amino acids and ammonium sulfate, 1.7 g/l; 
SC-Ura-Leu-His-Arg dropout mix, 2 g/l; sodium glutamate, 1 g/l; 
L-canavanine, 60 mg/l; agar, 2%. 

YPD: yeast extract, 10 g/l; peptone, 20 g/l; dextrose, 20 g/l; agar, 2%. 

YPD+G418: yeast extract, 10 g/l; peptone, 20 g/l; dextrose, 20 g/l; 
agar, 2%; G418, 200 mg/l. 

Diploid selection medium SC-Ura+ClonNAT: dextrose, 20 g/l; yeast 
nitrogen base without amino acids and ammonium sulfate, 1.7 g/l; 
SC-Ura dropout mix, 2 g/l; sodium glutamate, 1 g/l; ClonNAT, 200 mg/l; 
agar, 2%. The sodium glutamate is substituted for ammonium sulfate as 
the nitrogen source and makes the ClonNAT selection more reliable on the 
minimal medium. 

Others are the same as described in Section 2. 
Figure 8.4 Allele transfer-out. Unless otherwise stated, all the symbols are as in 
Fig. 8.3. (A) In this specific example, genomic DNA containing a Ts allele that was 
generated by the diploid shuffle method is PCR-amplified. Two black arrows 
represent the primers used for amplification. These primers are specific to the 5' 
and 3' regions of YFG in the recipient strain. They are also external to the two 
primers originally used to generate the Ts allele in the donor strain (represented 
by two broken arrows). (B) The PCR product is transformed into the strain of 
interest. The specific example shows allele transfer to a haploid strain. However, 
it may be desirable to transfer the allele to a diploid strain first, followed by 
sporulation and tetrad dissection, if the Ts allele being transferred is inviable 
in the specific genetic context of interest. The Ts allele replaces the wild-type 
YFG by homologous recombination (represented by dashed lines) and gives rise to 
Ura+ transformants (indicated by the Ts phenotype). 



3.2.2. Strains 

The genotype of haploid-convertible heterozygous diploid strains is similar 
to that described in Section 2. Haploid Ts strains have the following 
genotype: MATa ura3Δ0 leu2Δ0 his3Δ1 lys2Δ0 (or LYS2) met15Δ0 (or 
MET15) can1Δ::LEU2-MFA1pr::HIS3 yfg-ts::URA3. 

BY4742-ade2-101-NatMX is a MATα wild-type haploid strain in which 
the NatMX gene is linked to the ade2-101 ochre mutation. This strain is 
used to confirm the MATa Ura+ Ts candidate: MATα his3Δ1 leu2Δ0 
lys2Δ0 ura3Δ0 ade2-101-NatMX. 

OneShot® TOP10 Electrocomp Cells (Invitrogen Cat# C4040-52). 

3.2.3. Plasmids 

SB221 was derived from M4758 (Voth et al., 2003). In M4758, the BglII/ 
KpnI and SphI/EcoRI fragments contain the TEF promoter (388 bp) and 
terminator (262 bp), respectively, of Ashbya gossypii. In SB221 these 
fragments were replaced by a BamHI/KpnI PCR fragment (731 bp) containing 
the TEF promoter plus half of the KanMX gene, and a SphI/EcoRI fragment 
(751 bp) containing the other half of the KanMX gene and the TEF 
terminator. In both cases, the template for PCR products was the KanMX gene 
used for constructing the heterozygous diploid collection. Finally, the 
BamHI site of SB221 was adjusted with a Topo-TA site (Invitrogen) to 
create SB221-Topo-TA. 

3.3. Methods 

3.3.1. PCR mutagenesis 

This can be carried out as described in Section 2. However, we have 
successfully used slightly different conditions for all the experiments using 
this diploid shuffle-chromosomal method. Here, we have exclusively used 
LA Taq DNA polymerase (Cat# RR002B) and 150 μM MnCl2. One PCR 
of 50 μl, instead of four, is normally set up for each gene. Two primers, 
which allow amplification of the entire coding region, 250-300 bp of the 
5'-UTR, and 150-200 bp of the 3'-UTR of each gene, are used for PCR. 

3.3.2. Cloning the PCR products 

The mutagenized PCR products are purified and cloned into E. coli cells via 
electroporation. 

1. Purify the PCR product using the ChargeSwitch Clean-up Kit 
(Invitrogen Cat# CS12000). 

2. Set up a ligation reaction that includes the following components: 
0.5 μl SB221-Topo-TA, 1.0 μl ligation buffer (300 mM NaCl, 
15 mM MgCl2), purified PCR product (100 ng/1 kb), and ddH2O 
to a total volume of 6 μl. 

3. Incubate the reaction at room temperature overnight. 

4. Store the reaction at 4 °C (if using soon) or at −20 °C. 

5. Take 1.5 μl of the ligation products and transform into an aliquot of 50 μl 
OneShot® TOP10 Electrocomp Cells (Invitrogen Cat# C4040-52) via 
electroporation. 

6. Add 200 μl of LB to the transformation and incubate at 37 °C for 1 h. 

7. Plate a 1 μl aliquot onto an LB plus ampicillin (LB-amp) plate to estimate 
the transformation efficiency. 

8. Transfer 100 μl of the transformation suspension into a flask containing 
50 ml of LB-amp liquid medium and incubate at 37 °C for ~12 h to 
amplify the plasmid library. If necessary, store the rest of the 
transformation suspension (approximately 150 μl) in 20% glycerol at −70 °C 
for later use. 

9. Count the number of colonies on the 1 μl plate and multiply by 
the number of microliters inoculated into the LB-amp liquid medium to 
estimate the complexity of the library. (The total number of colonies 
should be ~150,000-200,000.) 

10. Purify the plasmid DNA library using a Plasmid Midi prep kit (Qiagen 
Cat# 12745). 

11. Digest 2 μl of 10× diluted library DNA sample with NotI (10 U/μl, 
NEB, Cat# R018S) in a 10 μl reaction and examine the digests by 
agarose gel electrophoresis. This allows estimation of the overall 
ligation efficiency. Fragments of 2.5 kb and 2.7 kb represent the empty vector 
(Fig. 8.5A), while a successful cloning reaction is indicated by a band shift 
of the 2.7 kb fragment (in accordance with the insert size) (Fig. 8.5B). 
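
The arithmetic in Step 9 can be sketched in a few lines of Python. This is only an illustration: the function names and the example colony count are hypothetical, and the target range is the one quoted in Step 9.

```python
def estimate_library_size(colonies_on_test_plate, plated_volume_ul, inoculated_volume_ul):
    """Scale a colony count from a small test plating up to the inoculated volume."""
    if plated_volume_ul <= 0:
        raise ValueError("plated volume must be positive")
    colonies_per_ul = colonies_on_test_plate / plated_volume_ul
    return colonies_per_ul * inoculated_volume_ul


def within_target(library_size, low=150_000, high=200_000):
    """Check the estimate against the range quoted in Step 9."""
    return low <= library_size <= high


# Hypothetical count: 1,800 colonies on the 1 ul plate, 100 ul inoculated.
size = estimate_library_size(1800, 1.0, 100.0)
print(size, within_target(size))
```

The same scaling applies to any test plating in which a known volume is plated and a known volume is carried forward.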

3.3.3. Yeast transformation 

Purified plasmid DNA of the random mutagenesis library is next 
digested with NotI or EcoRI (NEB, Cat# R0101T, 100 U/μl) and 
transformed into the corresponding haploid-convertible heterozygous knockout 
diploid mutant. The mutant alleles will be integrated into the endogenous 
locus via homologous recombination. 

1. Digest 40 μg of library plasmid DNA (overnight at 37 °C) in a total 
volume of 400 μl using either of the two restriction enzymes mentioned 
above. 

2. Reduce the reaction volume to 100 μl using a SpeedVac. 

3. Use the digested library DNA to transform the corresponding 
heterozygous diploid strain. Split the digested DNA into two 50 μl aliquots, and use 
each of them to prepare a separate transformation suspension as already 
described (Pan et al., 2004). 

4. Combine the two transformation suspensions (~1284 μl) and plate 1 μl 
onto a SC−Ura plate to estimate the transformation yield. 

5. Transfer the rest of the transformation suspension to 50 ml liquid 
SC−Ura and incubate at 30 °C for 2 days to amplify the yeast library. 

6. Count the number of colonies on the 1 μl plate and multiply by 
the number of microliters inoculated into the SC−Ura to estimate the total 
transformation yield. This number should be no less than 15,000. Otherwise, 
repeat the yeast transformation procedure. 
Figure 8.5 Testing the Topo-TA cloning efficiency in SB221. Unless otherwise 
stated, all the symbols are as in Fig. 3. (A) The SB221 vector is made up of the URA3 
gene flanked by a fragment containing the TEF promoter plus half of the KanMX gene 
(Kan), and another fragment containing the other half of the KanMX gene and the TEF 
terminator (Mx). This 2.7 kb fragment can be excised from the 2.5 kb backbone by 
restriction digestion with NotI. (B) After a successful Topo-TA cloning reaction, 
restriction digestion with NotI yields the 2.5 kb backbone and a band shift of the 
2.7 kb fragment, in accordance with the insert size. 

3.3.4. Yeast sporulation 

The amplified yeast library is then sporulated to convert the heterozygous 
diploid into haploid spores, as described in Section 2. However, a 
slightly different haploid selection medium, SC−Ura−Leu−His−Arg+Can, 
is used to determine the efficiency of producing haploid MATa Ura+ cells 
from the sporulation culture. 



3.3.5. Screening for Ts alleles 

After the plating efficiency is determined, typically ~6000 haploid MATa 
Ura+ colonies are screened for candidate Ts alleles for each gene. 

1. Spread the sporulation culture on 15 plates of SC−Ura−Leu−His− 
Arg+Can at ~400 colonies per plate. 

2. Incubate these plates at 25 °C for 3-4 days. 

3. Replica-plate colonies on each plate to a fresh plate of the same haploid 
selection medium and mark the orientation of both the mother and the 
daughter plates. 

4. Incubate the mother plates at 25 °C and the daughter plates at 37 °C. 

5. Assess the Ts phenotype of each clone by comparing its growth on both 
mother and daughter plates on the next day. 
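
If colony growth on the mother and daughter plates is recorded numerically (for example, as a relative colony size), the comparison in Step 5 can be sketched as below. The protocol itself relies on visual inspection; the scoring scale, the thresholds, and the function name here are assumptions made purely for illustration.

```python
def call_ts_candidates(growth_25, growth_37, min_permissive=0.5, max_ratio=0.2):
    """Return clones that grow at 25 C but poorly, or not at all, at 37 C.

    growth_25 and growth_37 map clone IDs to a relative growth score.
    Both thresholds are illustrative assumptions, not protocol values.
    """
    candidates = []
    for clone, g25 in growth_25.items():
        g37 = growth_37.get(clone, 0.0)  # a clone missing at 37 C counts as no growth
        if g25 >= min_permissive and g37 <= max_ratio * g25:
            candidates.append(clone)
    return sorted(candidates)


# Hypothetical scores for three clones.
mother = {"c1": 1.0, "c2": 0.9, "c3": 0.1}
daughter = {"c1": 0.95, "c2": 0.05, "c3": 0.0}
print(call_ts_candidates(mother, daughter))
```

Clones that already grow poorly at 25 °C (such as "c3" here) are excluded, since their failure at 37 °C is not informative about temperature sensitivity.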

3.3.6. Confirming Ts phenotypes 

Restreak the candidate Ts mutants onto two haploid selection medium plates, 
and incubate at 25 and 37 °C. Once the Ts phenotype is confirmed, 
backcross the MATa Ura+ Ts candidate to the wild-type BY4742-ade2-101- 
NatMX MATα strain. In this strain, the NatMX gene (which provides 
resistance to the drug ClonNAT) is linked to the ade2-101 ochre mutation. 
Select the diploid cells by streaking onto SC−Ura+ClonNAT medium, and 
then subject them to sporulation and dissection. Replicate the dissected tetrads 
onto YPD (25 and 37 °C), SC−Ura (25 °C), and YPD supplemented with 
G418 (25 °C). This confirms that: (1) the temperature sensitivity segregates 
in a Mendelian manner (2:2), indicating that the Ts phenotype depends 
on a single mutated gene; (2) the Ts phenotype is linked to the URA3 gene, 
and therefore cosegregates with the mutated PCR product; and (3) the 
mutagenized PCR product was integrated at the correct genomic locus, 
rendering the cells G418 sensitive (5'kanMX::yfg-ts-URA3::3'kanMX). 
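
The three criteria above can be written as a simple per-tetrad check. The sketch below is an illustration only: the `Spore` record and `check_tetrad` function are hypothetical names, and the third check (no G418-resistant spores in the tetrad) is one reading of criterion (3), since the wild-type backcross parent carries no KanMX marker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spore:
    grows_37: bool        # growth on YPD at 37 C
    ura_plus: bool        # growth on SC-Ura at 25 C
    g418_resistant: bool  # growth on YPD+G418 at 25 C


def check_tetrad(spores):
    """Evaluate one four-spore tetrad against the three confirmation criteria."""
    if len(spores) != 4:
        raise ValueError("a complete tetrad has exactly four spores")
    ts = [not s.grows_37 for s in spores]
    return {
        "ts_2_to_2": sum(ts) == 2,
        "ts_linked_to_ura": all(t == s.ura_plus for t, s in zip(ts, spores)),
        "all_g418_sensitive": not any(s.g418_resistant for s in spores),
    }


# Hypothetical tetrad: two Ts Ura+ spores and two wild-type Ura- spores.
tetrad = [
    Spore(grows_37=False, ura_plus=True, g418_resistant=False),
    Spore(grows_37=False, ura_plus=True, g418_resistant=False),
    Spore(grows_37=True, ura_plus=False, g418_resistant=False),
    Spore(grows_37=True, ura_plus=False, g418_resistant=False),
]
print(check_tetrad(tetrad))
```

A candidate would be accepted only if every dissected tetrad with four viable spores passes all three checks.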

3.3.7. Allele transfer-in 

This is carried out as described above, with minor modifications. 
A preexisting Ts allele (or other gene construct) is first PCR- 
amplified with the appropriate plasmid or genomic DNA as the template 
using a proofreading-competent polymerase (e.g., LA Taq DNA polymerase) 
that generates an "A" overhang on each 3' end of the PCR product. The 
PCR products are cloned into SB221 using Topo-TA cloning. The cloned 
PCR products are then released together with the URA3 marker and the 
KanMX5' and KanMX3' fragments from the vector backbone and 
transformed into the corresponding haploid-convertible heterozygous diploid 
mutant. Ura+ yeast transformants are sporulated as a population and plated 
on SC−Ura−Leu−His−Arg+Can to select for haploid MATa Ura+ cells at an 
appropriate colony density. Single colonies are then tested for the phenotype 
of interest (e.g., temperature sensitivity). Candidate clones are then 
backcrossed to a wild-type MATα strain and analyzed by tetrad dissection to 
further confirm the phenotype. 

3.3.8. Allele transfer-out 

Here, we describe how to transfer a Ts allele generated by the diploid shuffle 
method to any other ura3 strain background. Unless otherwise stated, 
methodologies are as mentioned above. 

1. Design PCR primers that will amplify the whole 5'kanMX::yfg-ts- 
URA3::3'kanMX cassette. These primers are specific to the 5' and 3' 
regions of YFG in the recipient strain. They should also be external to 
the two primers originally used to generate the Ts allele in the donor 
strain (Fig. 8.4A). 

2. Set up a PCR using genomic DNA of the Ts mutant as DNA template. 

3. Transform the PCR product into your strain of interest and select for 
Ura+ transformants. 

4. If the recipient strain is haploid, screen for colonies with a Ts phenotype. 

5. Backcross to a wild-type strain of the opposite mating type. 

6. Verify by tetrad analysis that the Ts phenotype is linked to the Ura+ 
phenotype. 

The recipient strain can also be diploid. In this case, Ura+ transformants 
are selected after Step 3, sporulated, and characterized by tetrad analysis to 
ensure that the Ts and Ura+ phenotypes cosegregate. This will also allow 
testing whether the Ts allele is viable in the particular genetic context of the 
recipient strain. It is also possible that this allele is no longer Ts in this strain 
background. If so, representative Ura+ transformants will need to be 
characterized by diagnostic PCR or sequencing to ensure that the 5'kanMX:: 
yfg-ts-URA3::3'kanMX cassette is indeed integrated at the right locus. 




4. Perspectives 



The two variations of the "diploid shuffle" are both highly efficient 
methods for making Ts mutants. It is essentially always possible, by screen- 
ing enough mutants using the methods outlined here, to find such alleles. 
An adaptation of the methods outlined here will be required for making Ts 
mutants in very large essential genes (in this case, the gene would be 
mutagenized in sections). The relative advantages of the methods described 
here are summarized in Table 8.4. 

We have chosen to move forward with the chromosomal method for 
the generation of a genome-wide collection of Ts mutants as a community 
resource because it was felt that most users would prefer integrated copies 
that would not fluctuate or be lost from a subpopulation of cells at each 
division due to their being on an episome. 



Table 8.4 Advantages of the two methods 

Method        Advantages 
Plasmid       Faster/easier; no special reagents needed 
Chromosomal   Mutant is single copy; mutant is stable (not episomal) 
Both          Works for any essential gene; mutants are tagged with 
              molecular barcodes assigned to original knockout 

The ability to generate a genome-wide collection compares favorably 
with other attempts to generate genome-wide resources for the study 
of essential genes, such as Tet-regulated alleles (Hartman et al., 2001; 
Mnaimneh et al., 2004) and DAmP (Schuldiner et al., 2005). Both of 
those approaches, while having their distinct advantages, only produced 
well-behaved alleles in about 30% of the cases. Disadvantages of Ts mutants 
include the fact that they are not uniform with regard to nonpermissive 
temperature and leakiness, and the fact that the needed temperature shifts 
may induce heat shocks or other side effects that could potentially cloud 
phenotypic analyses. Nevertheless, these alleles have been the bastion of 
traditional genetic analyses of essential genes. The Ts alleles we have 
sequenced include a mix of single amino acid substitutions and multiple 
amino acid substitutions; however, we have not sequenced enough of 
these to develop extensive statistics on this. Studies of collections of Ts 
mutants in a variety of genes have allowed the empirical determination of 
their typical characteristics. On this basis, a scheme for predicting Ts 
mutants has been developed (P. Ye, J. Dymond, X. Shi, Y.-Y. Lin, 
X. Pan, J. D. Boeke, and J. S. Bader, submitted for publication). This is 
likely to prove very useful for designing Ts mutants in organisms in which 
extensive screening is impractical. 
Abstract 

Genetic interactions represent the degree to which the presence of one 
mutation modulates the phenotype of a second mutation. In recent years, 
approaches for measuring genetic interactions systematically and quantita- 
tively have proven to be effective tools for unbiased characterization of gene 
function and have provided valuable data for analyses of evolution. Here, we 
present protocols for systematic measurement of genetic interactions with 
respect to organismal growth rate for two yeast species. 




1. Introduction 

Genetic interactions, which represent the modulation of the pheno- 
type of one mutation by the presence of a second mutation, have long been 
used as a tool to dissect the functional relationships among sets of genes 
(Guarente, 1993; Kaiser and Schekman, 1990). Classically, researchers have 
looked for strong qualitative differences between observed phenotypes of 
double mutants and the phenotypes of the two related single mutants. For 
example, a relationship referred to as synthetic lethality is observed when 
two mutations are not lethal when present individually but, when combined, 
result in an inviable organism. Synthetic sick/lethal, or negative, interactions 
have been used as evidence that two genes act in independent but 
complementary pathways. For example, strong negative genetic interactions exist 
between two key machines working in parallel pathways involved in 
chromatin assembly, the HIR complex (Hir1, Hir2, Hir3, and Hpc2) and the 
CAF complex (Msi1, Cac2, and Rlf2) (Collins et al., 2007a; Loyola and 
Almouzni, 2004) (Fig. 9.1B). On the other hand, if a normally deleterious 
mutation has no effect in the context of a second mutation, this is referred to 
as a positive genetic interaction, and often it identifies genes that act in the 
same pathway. Indeed, positive genetic interactions exist between members 
of the HIR complex as well as between the components that comprise the 
CAF complex (Collins et al., 2007a) (Fig. 9.1B). These two classes of 
interaction have been extremely useful for deciphering the organization of 
molecular pathways in model organisms, but in fact they represent only two 
special cases within a much larger spectrum of interactions (Fig. 9.1A). 

Recent advances in technology now make it possible to measure large 
numbers of genetic interactions systematically and in parallel in yeast (Pan et al., 
2004; Roguev et al., 2007; Tong et al., 2001), and it is possible to make these 
measurements quantitatively (Decourty et al., 2008; Roguev et al., 2008; 
Schuldiner et al., 2005). Synthetic growth defects of varying magnitudes as 
well as a broad range of suppressive and masking interactions can be detected. 
Importantly, the ability to make these measurements in a high-throughput 
fashion also provides a novel and valuable context for understanding 
quantitative genetic interactions. Work has demonstrated that the pattern of 
interactions (represented as a mathematical vector) for a given mutation can be used as 
a multidimensional phenotype. These patterns can be compared, and sets of 
genes producing similar patterns (and hence sets that are functionally closely 
related) can be accurately identified (Schuldiner et al., 2005; Tong et al., 2004; 
Figure 9.1 Epistatic interactions within and between chromatin assembly complexes. 
(A) The entire spectrum of genetic interactions. Quantitative genetic analysis can 
identify negative ((aΔbΔ) < (aΔ)(bΔ)), positive ((aΔbΔ) > (aΔ)(bΔ)), and neutral 
((aΔbΔ) = (aΔ)(bΔ)) genetic interactions. (B) Genetic interactions between and within 
the HIR and CAF chromatin assembly complexes. Using the E-MAP approach (Collins 
et al., 2007a), strong negative interactions were detected between components of the 
HIR-C (HIR1, HPC2, HIR3, and HIR2) and the CAF-C (MSI1, CAC2, and RLF2), 
which are known to function in parallel pathways to ensure efficient chromatin 
assembly. Conversely, positive genetic interactions were observed between components 
within each complex. Blue and yellow interactions correspond to negative and positive 
genetic interactions, respectively. (C) Plot of correlation coefficients generated from 
comparison of the genetic profiles from hir1Δ and hir2Δ to all other ~750 profiles from 
the chromosome biology E-MAP (Collins et al., 2007a). Note the high pairwise 
correlations with HIR1, HIR2, HIR3, and HPC2. 



Ye et al., 2005). For example, the patterns of interactions for HIR complex 
gene deletions are more strongly correlated to each other than to the patterns 
for other genes (Collins et al., 2007a) (Fig. 9.1C). 
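
As a rough illustration of comparing interaction patterns, the sketch below computes a Pearson correlation between two interaction profiles, skipping array positions where either measurement is missing. The function name, the handling of missing values, and the example scores are all assumptions made for illustration; they are not taken from the published analysis pipeline.

```python
import math


def profile_correlation(profile_a, profile_b):
    """Pearson correlation of two interaction profiles (None = missing value)."""
    pairs = [(a, b) for a, b in zip(profile_a, profile_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two shared measurements")
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        raise ValueError("a profile with no variation has no defined correlation")
    return cov / math.sqrt(var_a * var_b)


# Made-up interaction scores against the same four array mutants.
hir1 = [-3.0, 2.0, 0.1, -1.5]
hir2 = [-2.8, 1.9, 0.0, -1.7]
unrelated = [0.2, -0.1, 1.5, 0.3]
print(profile_correlation(hir1, hir2) > profile_correlation(hir1, unrelated))
```

Each profile is treated as a vector indexed by the same set of array mutants, so profiles must be aligned to a common gene order before they are compared.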

We designed the epistatic miniarray profile (E-MAP) approach focusing 
on two key strategies to maximize the value of high-throughput genetic 
interaction measurements. In this approach, quantitative genetic 
interactions are measured (using a simple growth phenotype) systematically 
between all pairwise combinations of 400-800 rationally chosen mutations. 
The first strategy is measuring interactions quantitatively, which allows the 
detection and analysis of the complete spectrum of interaction strengths. 
We have found both that this improves our ability to use patterns of genetic 
interactions to identify sets of genes acting in a common pathway (Collins 
et al., 2006), and that positive interactions (i.e., where a double mutant is 
fitter than expected) are particularly useful for making conjectures about 
gene function. Second, we aim to measure all pairwise interactions among a 
rationally chosen set of genes. This approach has several advantages. It 
increases the signal-to-noise ratio because the frequency of genetic interac- 
tions is higher between genes acting in related pathways (thus signal 
increases while noise is constant). It also provides a richer set of patterns 
for analysis. For example, having data for the components of all known 
DNA damage repair genes in an organism allows a researcher not only to 
classify a newly identified gene as a damage repair component, but also to 
assess which (if any) of the known repair pathways the new gene is most 
likely closely involved in. Importantly, with such a strategy, expansion is 
easy. New mutations of interest can rapidly be screened, and the results can 
be readily compared to and merged with the existing dataset. 

Genetic interactions are measured not only in terms of growth or 
viability, but also can be (and have been) derived from other phenotypes 
(Jonikas et al., 2009). In general, we can define a genetic interaction 
($\varepsilon_{AB}$) between mutations A and B in terms of any quantitative 
phenotype P as the difference between the observed phenotype of the double 
mutant ($P_{AB,\mathrm{observed}}$) and the expected phenotype of the double 
mutant ($P_{AB,\mathrm{expected}}$) if no interaction exists between the two 
mutations: 

$$\varepsilon_{AB} = P_{AB,\mathrm{observed}} - P_{AB,\mathrm{expected}}$$

This formulation clearly depends on our ability to compute 
$P_{AB,\mathrm{expected}}$ as a function of $P_{A,\mathrm{observed}}$ and 
$P_{B,\mathrm{observed}}$. A theoretically derivable form for 
$P_{AB,\mathrm{expected}}$ does not necessarily exist. However, a practically 
useful rule is that $P_{AB,\mathrm{expected}}$ should account for the typical 
combined effect of two individual mutations with phenotypes 
$P_{A,\mathrm{observed}}$ and $P_{B,\mathrm{observed}}$. Since strong 
genetic interactions are rare (Pan et al., 2004; Schuldiner et al., 2005; Tong 
et al., 2004), they manifest themselves as outliers that deviate from the surface 
that describes the broad trends in the majority of double-mutant data 
(Fig. 9.2). With this motivation, we define $P_{AB,\mathrm{expected}}$ 
empirically to represent this typical combined effect. Additionally, if a simple 
functional form (e.g., the product 
$P_{A,\mathrm{observed}} \times P_{B,\mathrm{observed}}$) captures accurately 
the empirical relationship, this can be an extremely useful simplification. 
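
A minimal sketch of this definition follows, using the product of the single-mutant phenotypes as the default expectation. The function names, the neutral-zone tolerance, and the example values are hypothetical. In practice the expected value would be fitted empirically to the bulk of the double-mutant data rather than assumed to be multiplicative.

```python
def interaction_score(p_a, p_b, p_ab_observed, expected=None):
    """Observed double-mutant phenotype minus the expected one.

    By default the expectation is the product p_a * p_b (an assumed
    simplification); pass any callable expected(p_a, p_b) to replace it.
    """
    if expected is None:
        def expected(a, b):
            return a * b
    return p_ab_observed - expected(p_a, p_b)


def classify(score, tolerance=0.05):
    """Label a score; the tolerance for 'neutral' is an illustrative choice."""
    if score > tolerance:
        return "positive"
    if score < -tolerance:
        return "negative"
    return "neutral"


# Single mutants at 0.8 and 0.5 of wild-type growth (hypothetical values).
for observed in (0.1, 0.4, 0.8):
    eps = interaction_score(0.8, 0.5, observed)
    print(observed, round(eps, 3), classify(eps))
```

Passing a different `expected` callable allows an empirically fitted surface to replace the product rule without changing the rest of the calculation.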
Figure 9.2 Genetic interactions as deviations from expected double-mutant 
phenotypes. An idealized smooth surface is shown to represent the expected combined effects 
of independent mutations. The surface shown is not based on real data, but is intended to 
serve as an abstract example. The x- and y-axes represent single-mutant phenotypes, 
scaled between 0 and 1. The height of the surface (along the z-axis) represents the 
corresponding expected double-mutant phenotype. This surface should accurately 
describe the empirical typical double-mutant phenotypes, and it should be symmetric 
about the line y = x. The grey spheres represent observations for specific double mutants. 
The quantitative interaction is represented by the vertical distance from the point to the 
surface. As both of these points lie above the surface, they represent positive interactions. 

In this chapter, we describe in detail a method for measuring quantitative 
genetic interactions based on the area of yeast colonies. We have used this 
strategy (with some technical differences) for both Saccharomyces cerevisiae 
and Schizosaccharomyces pombe (Roguev et al., 2007; Schuldiner et al., 2006), 
and we provide here the details for each protocol. We also give a brief 
description of how analysis of the resulting data can be used to generate 
specific biological hypotheses. 




2. Selection of Mutations for Genetic Analysis 



Ultimately, comprehensive genetic interaction maps in both budding and 
fission yeasts may be generated where every possible pairwise double mutant 
is created and phenotypically assessed. Completion of such comprehensive 
genome-wide maps will represent a major accomplishment; however, it will 
still be some time until this is achieved. Additionally, our observations so far 
indicate that such maps will consist primarily of neutral genetic interactions. 
As mentioned above, the frequency of strong negative or positive genetic 
interactions between randomly chosen gene pairs has been estimated to be as 
little as 0.5%, but interactions are much more frequent between functionally 
related pairs of genes (Schuldiner et al., 2005; Tong et al., 2004). The E-MAP 
approach was devised to take advantage of this fact by specifically targeting 
genes that are likely to be functionally related. There are various ways by 
which sets of mutations can be selected for high-density, quantitative genetic 
interaction screening, and which is best strongly depends on the biological 
process that one wishes to interrogate, and the types of answers one would 
like to uncover. Genes can be chosen with the goal of identifying missing 
links within and between known pathways (e.g., what genes control the 
deposition of variant histones?), and they can also be chosen to address 
broader questions (to what extent are genetic interactions conserved during 
evolution?). We provide here an overview of strategies that have been used in 
the past. 

The first E-MAP focused on approximately 400 genes likely to be 
involved in the early secretory pathway. In this case, gene selection was 
based largely on the localization of the corresponding proteins to either the 
Golgi apparatus or the endoplasmic reticulum (Schuldiner et al., 2005). The 
rationale was that proteins residing in a common subcellular compartment 
are more likely to be functionally related. Of course, gene selection solely 
based on localization would miss important factors acting in an indirect 
fashion. For example, signaling proteins such as kinases may mediate strong 
effects on spatially distant processes. 

Another recently generated E-MAP focused on factors involved in 
various aspects of chromosome function, including transcription, DNA 
repair/replication, and chromosome segregation (Collins et al., 2007a). 
For this study, protein-protein interaction datasets were used as a primary 
source for gene selection. Using information from systematic affinity tag/ 
purification mass spectrometry experiments (Collins et al., 2007b; Gavin 
et al., 2006; Krogan et al., 2006), the genes whose corresponding proteins 
were contained within one or more complexes involved in the chromatin- 
related functions were targeted. Once again, this method alone would miss 
factors indirectly impinging on the processes. 

Sets of genes have also been targeted based on the presence of character- 
ized domains (e.g., kinase domain, SET domain, etc.) or their likely molec- 
ular activity. For instance, an E-MAP was generated that comprised all 
kinases and phosphatases (both protein and nonprotein), regulatory subunits 
for these enzymes, phospho-binding proteins, and many factors known to 
be phosphorylated (Fiedler et al., 2009). The genes in this E-MAP impinge 
on all processes in the cell and therefore this approach allowed for a global 
view of the genetic architecture of the signaling apparatus instead of 
high-density information on one specific biological process. This work revealed 
an enrichment of positive genetic interactions between kinases, 
phosphatases, and their corresponding substrates (Fiedler et al., 2009), which would 
have been difficult to assess without a broad survey of these protein classes. 

Finally, genes can be selected based on global, unbiased phenotypic 
screens. A recent study began with a genome-wide screen for mutations 
that affect the activity of the unfolded protein response (UPR) pathway 
(Jonikas et al., 2009). Then, approximately 400 mutations that modulate 
basal UPR activity were selected and further characterized using systematic 
double-mutant analysis. Of course, gene selection in this fashion requires 
single mutants to have a measurable effect on the process in question, 
whereas multiple mutations may be required to see an effect when com- 
pensatory pathways exist. 

In general, a combination of the approaches described above is likely to 
be most effective. For example, in the final chromosome function E-MAP 
(Collins et al., 2007a), selection was not only based on composition of 
protein complexes but also on genome-wide genetic interaction screens 
using mutations of genes known to be integrally involved in processes of 
interest. In this way, a number of genes not associated with a known 
chromatin remodeling complex and not yet functionally annotated could 
be included in the analysis. Using this approach, we included the previously 
uncharacterized protein Rtt109 in the genetic analysis. Based on the genetic 
interaction profiling, we were able to link its function to the chromatin 
assembly protein Asf1 and identify it as the founding member of a new 
family of histone acetyltransferases responsible for K56 histone H3 
acetylation (Collins et al., 2007a; Driscoll et al., 2007; Han et al., 2007). 

Recently, this analysis was extended to another, evolutionarily distant 
organism (S. pombe). Selection of genes for analysis was guided by the same 
criteria. Additionally, mapping of direct orthologs between the two organ- 
isms made it possible to observe some of the general trends in genetic 
interaction network evolution (Roguev et al., 2007, 2008). 




3. Generation and Measurement of Double 
Mutant Strains 

Here, we describe protocols for screening in both S. cerevisiae (SGA, 
synthetic genetic array) (Collins et al., 2006; Schuldiner et al., 2006; Tong 
and Boone, 2006) and S. pombe (PEM, pombe epistatic mapper) (Roguev 
et al., 2007). A flowchart comparing the two protocols is presented in 
Fig. 9.3. Both methodologies are, in essence, high-throughput procedures 
for random spore analysis where the growth of double mutant cell pools is 
SGA (S. cerevisiae) 
Cross: MATα, SGA markers, ΔX::NAT^R × MATa, ΔY::G418^R 
  1. Mating (YPAD): 1 day 
  2. Diploid selection (YPAD + G + N): 2 days 
  3. Meiosis and sporulation, sporulation medium (SPO): 5 days 
  4. HS and MTS (HS1) (SD + can + S-AEC): 2 days 
  5. HS and MTS (HS2) (SD + can + S-AEC): 1 day 
  6. HS, MTS, and SMS (SM1) (SD + can + S-AEC + G): 1 day 
  7. DMS (SD + can + S-AEC + G + N): 1 day 
  Total: 13 days, 9 plates 

PEM-2 (S. pombe) 
Cross: h−, PEM-2 markers, ΔX::NAT^R × h+, ΔY::G418^R 
  1. Mating, meiosis, and sporulation, sporulation medium (SPAS): 5 days 
  2. HS, MTS, and SMS (GC1) (YE5S + G + C): 3 days 
  3. HS, MTS, and SMS (GC2) (YE5S + G + C): 2 days 
  4. DMS (YE5S + G + N + C): 1 day 
  Total: 11 days, 6 plates 

HS - haploid selection 
MTS - mating type selection 
SMS - single-mutant selection 
DMS - double-mutant selection 

Figure 9.3 Overview of the experimental protocol. Flow charts outlining the series of 
selections used in S. cerevisiae E-MAP screens (left) and S. pombe PEM screens (right) are 
presented. 



monitored on agar using high-density cell arrays. The same logic is followed 
in both model organisms. In each screen, a query strain containing one 
NAT-marked mutation is crossed to an array of strains carrying G418- 
marked mutations. An array of diploid strains is generated by mating and a 
change of growth media is used to induce meiosis and sporulation. A series 
of selections are then used to select for haploid double-mutant cells of a 
particular mating type carrying both mutations. The growth phenotypes of 
the resulting double mutants are assessed by measuring colony area after a 
defined period of time. 

The two protocols differ in several important technical details due to 
biological differences between the two organisms. In S. cerevisiae, the mating 
step occurs on rich medium and nitrogen starvation is used to induce 
meiosis and sporulation. However, in S. pombe, the entire sequence (mating, 
meiosis, and sporulation) is induced by limited nitrogen, allowing the whole 
process to be carried out in a single step. Unlike in S. pombe, the diploid 
phase in S. cerevisiae is very stable, so an additional step enriching for diploids 
(diploid selection) is usually included in the SGA screen. In both systems, 
after the spores are germinated, the remaining diploid cells are killed (during 
haploid selection steps, HS), and only haploid cells of one mating type are 
allowed to grow (mating type selection, MTS). Importantly, one marker 
used in this selection comes from the parent strain of the opposite mating 
type, thus requiring that only haploid progeny from the initial mating can 
pass the selection. HS in both systems and MTS in S. pombe take advantage 
of recessive selectable markers. In S. cerevisiae, canavanine and S-AEC are 
used to select against the parent diploid cells, which are heterozygous for the 
CAN1 and LYP1 genes encoding transporters needed for import of these 
toxic compounds. A mating type-specific promoter driving the transcrip- 
tion of a conditionally essential metabolic gene (HIS3) is used for MTS. 
In S. pombe, a recessive allele providing resistance to cycloheximide has been 
engineered to perform both selections using a single marker (Roguev et al., 
2007). In both the SGA and PEM protocols, two rounds of HS and MTS 
are carried out. In S. cerevisiae, an additional single mutant selection (SMS) 
step enriching for single and double mutant haploids is performed. In the 
PEM protocol, the same medium is used for HS, MTS, and SMS. Finally, a 
double-mutant selection (DMS) is performed, the arrays are photographed 
and the images analyzed. 

All selections before the final DMS step (HS, MTS, and SMS) let single 
and double mutants compete within the same cell mixture. These competi- 
tion steps greatly improve the sensitivity and dynamic range of the method 
by allowing the detection of subtle synthetic growth defects. The final 
colony sizes measured after growth on the DMS plates reflect both the 
growth rate of the double mutant cells during this final stage, as well as the 
fraction of cells deposited on the final plate which has the right genotype 
(both NAT- and KAN-marked mutations). This latter quantity is largely 
determined by competitive growth during earlier steps. 

In planning a set of screens, one must decide how many replicate mea- 
surements will be sufficient to generate a high-quality dataset. By reanalyzing 
our earlier published data (Schuldiner et al., 2005), we find that the first 
three independent replicate measurements give substantial improvements in 
the precision of the measured average colony size (Fig. 9.4B). Additional 
replicates beyond three improve data quality, but it may be worth sacrificing 
these marginal gains in exchange for the ability to complete more screens at 
lower cost. 
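
The trade-off between replicate number and precision can be explored with a small resampling exercise in the spirit of the analysis behind Fig. 9.4B (draw N replicates from a pool, average them, repeat many times, and take the standard deviation of the averages). The Python sketch below is illustrative only: the function name, the simulated pool of 36 measurements, and the noise level are assumptions and are not taken from the published data.

```python
import random
import statistics

def sd_of_mean(replicates, n, draws=1000, seed=0):
    """Standard deviation of the mean of n replicates drawn at random
    (without replacement) from a pool of replicate measurements."""
    rng = random.Random(seed)
    means = [statistics.mean(rng.sample(replicates, n)) for _ in range(draws)]
    return statistics.stdev(means)

# Hypothetical pool: 36 normalized colony sizes for one strain (simulated).
rng = random.Random(1)
pool = [rng.gauss(1.0, 0.15) for _ in range(36)]

# Precision improves quickly over the first few replicates, then levels off.
for n in (1, 2, 3, 6, 12):
    print(n, round(sd_of_mean(pool, n), 3))
```

Running the same calculation on pilot data from one's own screens gives a rough sense of where additional replicates stop paying for themselves.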

Detailed SGA and PEM protocols using a Singer RoToR pinning 
station are given below. The protocols below may also be executed with 
other types of pinning devices such as hand pinning tools or alternate 
robotic pinners (Schuldiner et al., 2006). If hand-pinning tools are used, a 
pinner with small diameter pins (e.g., VP384FP4 from V&P Scientific, 
San Diego, CA) should be used for the last step. The length of incubation 
times may need to be adjusted slightly for other pinning systems. We have 
used two days rather than one for the diploid selection and for the SMS 
steps for S. cerevisiae screens in 768 colony format with hand pinning tools. 
In our experience, results obtained using the Singer RoToR have better 
signal-to-noise than those obtained with hand-pinning devices. However, 
satisfactory results can be obtained with either method. 

3.1. Basic SGA protocol 

3.1.1. Genotypes 

Query: MATα; his3Δ1; leu2Δ0; ura3Δ0; LYS2+; can1::STE2pr-SpHIS5 
(SpHIS5 is the S. pombe HIS5 gene); lyp1Δ::STE3pr-LEU2; XXX::NatMX 

Library: MATa; his3Δ1; leu2Δ0; ura3Δ0; met15Δ0; LYS2+; CAN1+; 
LYP1+; YYY::KanMX 

3.1.2. Growth media (all recipes for 1 l of media) 

YPAD (YEPD + adenine): Mix 10 g yeast extract, 20 g peptone, 120 mg 
adenine, 20 g Difco Agar, and DDW up to a final volume of 1 l. Autoclave. 
Add 50 ml of sterile 40% glucose. 

SPO (sporulation media: NGS Agar): Mix 20 g Difco Agar and 820 ml 
DDW in one flask. Mix 0.5 g -ura-trp amino acid powder mix (Sunrise 
Science #1010-100), 2.5 ml 20 mM uracil, 2.5 ml 20 mM tryptophan, and 
163 ml DDW in a second flask. Autoclave each flask separately. Mix the 
two flasks and add 20 ml of 500 mg/ml filter-sterilized potassium acetate. 

Note: do not autoclave the potassium acetate! 

SD-HIS-LYS-ARG (for haploid selections): Mix 20 g Difco Agar and 
850 ml DDW in one flask. Mix 6.7 g yeast nitrogen base without amino 
acids, 2 g amino acid drop out mix (recipe below), and 100 ml DDW in a 
second flask. Autoclave both flasks. Mix the flasks and add 50 ml of 40% 
glucose. 

SD(MSG)-HIS-LYS-ARG (for single- and double-mutant selections): 
Mix 20 g Difco Agar and 850 ml DDW in one flask. Mix 1.7 g yeast 
nitrogen base without amino acids and without ammonium sulfate (Becton, 
Dickinson and Company #233520), 2 g amino acid drop out mix (recipe 
below), 1 g monosodium glutamate, and 100 ml DDW in a second 
flask. Autoclave the first flask and filter sterilize the second. Mix the two 
flasks together and add 50 ml 40% glucose. 

Amino acid drop out mix: Mix 3 g adenine, 10 g leucine, 0.2 g 
para-aminobenzoic acid, and 2 g each of alanine, asparagine, aspartic acid, 
cysteine, glutamine, glutamic acid, glycine, inositol, isoleucine, 
methionine, phenylalanine, proline, serine, threonine, tryptophan, tyrosine, 
uracil, and valine. 

Note: Be sure to pour level plates, as this is very important for the 
effectiveness of the robotic pinning steps. 

Figure 9.4 Overview of the data processing procedure. (A) A flowchart describing the data processing procedure for a single screen is 
shown. The first images are digital photographs of arrays of yeast colonies. In the following images heatmaps of either measured colony sizes 
or genetic interaction scores are shown. In colony size heatmaps, blue represents small colonies, black represents average-sized colonies, and 
yellow represents large colonies. In the genetic interaction heatmap, blue represents negative interactions, black represents neutral 
interactions, yellow represents positive interactions, and gray represents missing data (or data filtered out during quality control). 



3.1.3. Drug concentrations 

NAT (N): 100 mg/l; G418 (G): 100 mg/l; S-AEC (S): 50 mg/l; 
canavanine (c): 50 mg/l 

Note: when adding labile compounds (NAT, G418, S-AEC, and canavanine), 
first cool the media until the container is still hot but no longer painful 
to touch. 

Note: previous protocols used 200 mg/l NAT and G418, but we have found 
100 mg/l to be sufficient. 
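
When pouring batches of a different size, the amount of each drug scales linearly with the media volume. The short Python helper below is a convenience sketch: the final concentrations are those listed in this section, while the function names and the stock concentration used in the example are hypothetical.

```python
# Final concentrations in mg/l, as listed in Section 3.1.3.
SGA_DRUGS_MG_PER_L = {"NAT": 100, "G418": 100, "S-AEC": 50, "canavanine": 50}

def drug_mass_mg(drug, media_volume_ml, table=SGA_DRUGS_MG_PER_L):
    """Mass of drug (mg) needed for a given volume of media (ml)."""
    return table[drug] * media_volume_ml / 1000.0

def stock_volume_ml(drug, media_volume_ml, stock_mg_per_ml,
                    table=SGA_DRUGS_MG_PER_L):
    """Volume of stock solution (ml) giving the final concentration.
    The stock concentration is supplied by the user (an assumption)."""
    return drug_mass_mg(drug, media_volume_ml, table) / stock_mg_per_ml

print(drug_mass_mg("NAT", 500))            # mg of NAT for 500 ml of media
print(stock_volume_ml("G418", 1000, 200))  # ml of a hypothetical 200 mg/ml stock
```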



3.1.4. Plates nomenclature 

YPAD = YPAD; SPO = SPO; Diploid = YPAD + N + G; NAT = 
YPAD + N; G418 = YPAD + G; HS = SD-HIS-LYS-ARG + S + c; 
SM = SD-HIS-LYS-ARG + S + c + G; DM = SD-HIS-LYS-ARG + 
S + c + G + N 

HS1 and HS2 are HS plates used in two consecutive steps of the 
protocol. 

Figure 9.4 (continued) (B) The variability in measured growth phenotype (mean colony size over the replicate 
measurements) is shown as a function of number of experimental replicates. The curves 
shown were generated using data from 36 replicates of a control screen run while 
generating the early secretory pathway E-MAP (Schuldiner et al., 2005). On a given 
curve, the point corresponding to N replicates was generated by randomly drawing N of 
the 36 replicates for a particular strain and computing the mean normalized colony size. 
This process was repeated 1000 times, and the standard deviation over these 1000 
repeats was plotted. Each curve represents data for a different strain. Five different 
representative strains with different levels of measurement variability were chosen for 
analysis. (C) The variability of the empirically determined expected double-mutant 
phenotype was estimated as a function of the number of screens analyzed in parallel. 
In this case, sets of screens of the indicated size were drawn at random from a set of 329 
screens completed at approximately the same time from the early secretory pathway 
E-MAP (Schuldiner et al., 2005). For each point on each curve, 1000 random draws 
were completed, and each time expected colony size values were computed. The 
standard deviation of the expected colony sizes (over the 1000 random draws) was 
plotted for four representative strains with different single mutant phenotypes. 

3.1.5. Procedure 

3.1.5.1. Preparation of query lawn 

A lawn of cells of the query strain is prepared for use in the mating step. 
Inoculate a liquid culture in YEPD from a single colony of the query strain 
(from a YEPD + NAT plate) and grow to saturation overnight. 
Prepare a lawn of cells: Spread up to 500 µl of thick culture onto a NAT 
plate using glass beads and incubate at 30 °C for 2 days. 

3.1.5.2. Preparation of library arrays (or "T-arrays" for "Target 
arrays") in 1536 format 

The library array is replicated for use in the mating step. 

Program: Agar-Agar. Replicate. Replicate Many. 1536 

Parameters: Source plate: G418; Target plate: G418; Source pressure: 100%; 
Target pressure: 100%; Offset: Manual; Number of replicas: 3; Economy: 
ON; Revisit source: ON 

Note: More than three replicas/source at this density may not be consistent 
and slow growing mutants (e.g., small colonies) may be lost. 

Incubate at 30 °C for 1 day. 

3.1.5.3. Mating 

Query and array cells are pinned on top of each other on a fresh plate for mating. First 
pin the T-array twice and then pin the lawn twice on top of it. 
Program: Agar-Agar. Replicate. 1536 

Parameters: Source plate: T-array and lawn; Target plate: YPAD; Source 
pressure: 100%; Target pressure: 100%; Offset: Manual; Economy: OFF 
Incubate at 30 °C for 1 day. 

3.1.5.4. Diploid selection 

Cells are pinned from the mating plate and diploids are selected by using NAT and 
G418. 

Program: Agar-Agar. Replicate. 1536 

Parameters: Source plate: mating; Target plate: Diploid; Source pressure: 
100%; Target pressure: 100%; Offset: Manual; Economy: OFF 

Incubate at 30 °C for 1 day. 

3.1.5.5. Sporulation 

Replicate the arrays from Diploid onto SPO plates using 384 pads. 
Do 2 pins per array onto the same target SPO plate. This is the most critical 
stage of the protocol and transferring enough cells onto the target plate is 
very important. 

Program: Agar-Agar. Replicate. Replicate One. 384 

Parameters: Source plate: Diploid; Target plate: SPO; Source pressure: 
100%; Target pressure: 100%; Offset: OFF; Economy: OFF 

Incubate for 5 days at room temperature in a humid environment. 

Note: When loading the pads click "Modify" to change to 384 pads. 

Note: The plates should be packed in plastic bags and kept in a humid 
environment to prevent drying. 

3.1.5.6. HS1 

Replicate the arrays from the SPO plates onto HS plates using 384 pads. Do 
2 pins per array. The cells do not divide on the SPO media, so maximizing 
cell transfer at this step and at the previous step is critical. 

Program: Agar-Agar. Replicate. Replicate One. 384 

Parameters: Source plate: SPO; Target plate: HS (HS1); Source pressure: 
100%; Target pressure: 100%; Offset: Automatic 

Note: Incubate for 2 days at 30 °C. 



3.1.5.7. HS2 

Replicate the arrays from the HS1 plates onto HS2 plates using 1536 pads. 
Do 1 pin per array. 

Program: Agar-Agar. Replicate. Replicate One. 1536 

Parameters: Source plate: HS (HS1); Target plate: HS (HS2); Source pressure: 
100%; Target pressure: 100%; Offset: Automatic 

Note: Incubate for 1 day at 30 °C. 



3.1.5.8. SM 

Replicate the arrays from the HS2 plates onto SM plates using 1536 pads. 
Do 1 pin per array. 

Program: Agar-Agar. Replicate. Replicate One. 1536 

Parameters: Source plate: HS (HS2); Target plate: SM; Source pressure: 
100%; Target pressure: 100%; Offset: Automatic 

Note: Incubate for 1 day at 30 °C. 



3.1.5.9. DM 

Replicate the arrays from the SM plates onto DM plates using 1536 pads. 
Do 1 pin per array. 

Program: Agar-Agar. Replicate. Replicate One. 1536 

Parameters: Source plate: SM; Target plate: DM; Source pressure: 100%; 
Target pressure: 100%; Offset: Automatic 

Incubate at 30 °C. 

Take pictures of the DM plates after 48 h. 

Store the final plates in a cold room. 
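
For planning purposes, the pinning steps from mating onward can be laid out as a simple schedule. The sketch below encodes the incubation times stated in Sections 3.1.5.3 to 3.1.5.9; the data structure and function names are our own illustration and are not part of the RoToR software. Preparation of the query lawn (2 days) and the library arrays (1 day) happens beforehand and is not included.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    name: str
    target_plate: str
    days: int
    temperature: str

SGA_STEPS = [
    Step("Mating", "YPAD", 1, "30 C"),
    Step("Diploid selection", "Diploid", 1, "30 C"),
    Step("Sporulation", "SPO", 5, "room temperature"),
    Step("HS1", "HS", 2, "30 C"),
    Step("HS2", "HS", 1, "30 C"),
    Step("SM", "SM", 1, "30 C"),
    Step("DM", "DM", 2, "30 C"),  # photographed after 48 h
]

def schedule(steps, start_day=0):
    """Return (step name, day the plate is pinned, day it is ready)."""
    rows, day = [], start_day
    for s in steps:
        rows.append((s.name, day, day + s.days))
        day += s.days
    return rows

for name, start, end in schedule(SGA_STEPS):
    print(f"{name:18s} day {start:>2} -> day {end:>2}")

total_days = sum(s.days for s in SGA_STEPS)
print("Total:", total_days, "days from mating to photography")
```

With hand-pinning tools, the text above suggests two days rather than one for the diploid selection and SM steps, which would lengthen this schedule accordingly.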

3.2. Basic PEM protocol 

3.2.1. Genotypes 

Query: h-; ade6-M210; leu1-32; ura4-D18; mat1_m-cyhS, smt0; 
rpl42::cyhR (sP56Q); XXX::NatMX6 

Library: h+; ade6-M210 (or M216); ura4-D18; leu1-32; YYY::KanMX6 

3.2.2. Growth media 

YE5S (general purpose rich media): 5 g/l yeast extract; 30 g/l glucose; 225 
mg/l each of adenine, leucine, histidine, uracil, and lysine; 20 g/l Difco 
Agar 

SPAS (mating and sporulation media): 10 g/l glucose; 1 g/l KH2PO4; 
45 mg/l each of adenine, histidine, leucine, uracil, and lysine 
hydrochloride; 1 ml 1000× vitamin stock (1 g/l pantothenic acid; 10 g/l 
nicotinic acid; 10 g/l inositol; 10 mg/l biotin) 

3.2.3. Drug concentrations 

NAT (N): 100 mg/l; G418 (G): 100 mg/l; CYH (C): 100 mg/l 

3.2.4. Plates nomenclature 

YE5S = YE5S; SPAS = SPAS; NAT = YE5S + N; G418 = YE5S + G; 
GC = YE5S + G + C; GNC = YE5S + G + N + C 

GC1 and GC2 are GC plates used in two consecutive steps of the 
protocol. 

3.2.5. Procedure 

3.2.5.1. Preparation of Query arrays (Q-arrays) in 1536 format 

Prepare a lawn of cells: Spread up to 500 µl of thick culture onto a NAT 
plate using glass beads and incubate at 30 °C for 2-3 days. 

Program: Agar-Agar. Replicate. Replicate One. 1536 

Parameters: Source plate: NAT; Target plate: NAT; Source pressure: 100%; 
Target pressure: 100%; Offset: Manual; Offset radius: 1 mm 

Note: Do 2 pins per plate, picking cells from different parts of the lawn 
plate. 

Incubate at 30 °C for 2-3 days. 

3.2.5.2. Preparation of library arrays (T-arrays) in 1536 format 

Program: Agar-Agar. Replicate. Replicate Many. 1536 
Parameters: Source plate: G418; Target plate: G418; Source pressure: 100%; 
Target pressure: 100%; Offset: Manual; Number of replicas: 3; Economy: 
ON; Revisit source: ON 
Note: More than three replicas/source at this density may not be consistent 
and slow growing mutants (e.g., small colonies) may be lost. Incubate at 
30 °C for 2-3 days. 

3.2.5.3. Mating 

Combine the T-array and the Q-array onto a SPAS plate generating a 1536 
density mating array. First pin the T-array twice (2x) and then pin the 
Q-array twice on top of it with agar mixing. 
Program: Agar-Agar. Replicate. 1536 

Parameters: Source plate: T-array and Q-array; Target plate: SPAS; Source 
pressure: 100%; Target pressure: 100%; Offset: Manual; Economy: OFF 
Note: Incubate for 5-6 days at room temperature, packing the plates in 
plastic bags to prevent drying. 

3.2.5.4. SPAS-GC1 

Replicate the mating arrays from SPAS onto GC plates using 384 pads. 
Do 2 pins per array onto the same target GC plate. You will need eight 
384 pads per array. This is the most critical stage of the protocol and 
transferring enough cells onto the target plate is very important. 

Program: Agar-Agar. Replicate. Replicate One. 1536 

Parameters: Source plate: SPAS; Target plate: GC; Source pressure: 100%; 
Target pressure: 100%; Offset: OFF; Economy: OFF 

Note: When loading the pads click "Modify" to change to 384 pads. 

Incubate for 3 days at 30 °C. 
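
The pad count quoted for this step follows from the array and pad densities: a 1536-density array is covered by four 384 pads, and each position is pinned twice. A minimal Python sketch of this arithmetic is given below; the function name is hypothetical, and it assumes that a fresh pad is used for every pin.

```python
def pads_needed(array_density, pad_density, pins_per_position=1):
    """Number of pads needed to transfer one array, assuming each pad is
    used only once."""
    if array_density % pad_density:
        raise ValueError("array density must be a multiple of pad density")
    return (array_density // pad_density) * pins_per_position

print(pads_needed(1536, 384, pins_per_position=2))  # SPAS-GC1 step
print(pads_needed(1536, 1536))                      # later 1536-pad steps
```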



3.2.5.5. GC1-GC2 

Replicate the arrays from the GC1 plates onto GC2 plates using 1536 pads. 

Do 1 pin per array. 

Program: Agar-Agar. Replicate. Replicate One. 1536 

Parameters: Source plate: GC (GC1); Target plate: GC (GC2); Source 

pressure: 100%; Target pressure: 100%; Offset: Automatic 

Note: Incubate for 2 days at 30 °C. 



3.2.5.6. GC2-GNC 

Replicate the arrays from the GC2 plates onto GNC plates using 1536 pads. 

Do 1 pin per array. 

Program: Agar-Agar. Replicate. Replicate One. 1536 

Parameters: Source plate: GC (GC2); Target plate: GNC; Source pressure: 

100%; Target pressure: 100%; Offset: Automatic 

Note: Incubate at 30 °C. Take pictures of the GNC plates at 24, 36, and 48 h. 

Store the final plates in a cold room. 




3.3. Digital photography 

We take color digital photographs of the final plates using a Canon PowerShot 
S3 IS camera (6.0 megapixels) at a resolution of 180 dpi, focal length = 18.2 
and F-number = 8. Images are taken at a distance of 60 cm (24 in.) by 
mounting the camera on a KAISER camera stand (Germany). The position 
of the plate is fixed using a custom-made metal plate holder permanently 
bolted to the camera stand. The base of the camera stand is covered with 
black velvet to create a uniform black background for the images. Illumination 
is provided by two fixed luminescent lamps (25-30 W) outside of a 20 × 20 
in. nylon soft light tent which serves to even the illumination and prevent 
reflections. 




4. Data Processing and Computation of Scores 

The data from the screens are collected as digital photographs of 
arrays of yeast colonies. These images are converted into measures of 
interactions between mutations through a multistep computational process 
(Collins et al., 2006). Colony areas are measured digitally using the HT 
Colony Grid Analyzer Java program (Collins et al., 2006). The resulting 
sizes are then processed using a software toolbox we have developed for use 
with MATLAB. Both pieces of software are freely available for download 
(http://sourceforge.net/projects/ht-col-measurer/ and 
http://sourceforge.net/projects/emap-toolbox/). The computational steps for 
converting colony area measurements into genetic interaction scores are 
depicted in Fig. 9.4A and outlined in the following steps: 



4.1. Preprocessing and normalization 

A preprocessing and normalization step is used to correct for systematic 
artifacts (uneven image lighting, artifacts due to physical curvature of the 
agar surface on which the colonies grew, differences in growth time, etc.), 
and also to account for the growth phenotype of the query strain. 

Several types of systematic artifacts can arise in the data collection process 
that give rise to spatial patterning of measured colony sizes which is indepen- 
dent of the growth properties of the yeast strains in the experiment. For 
instance, uneven lighting can result in apparently larger colonies in areas with 
brighter light. Additionally, an uneven agar surface can result in deposition of 
a larger number of cells in vertically elevated areas of the agar surface, and 
deposition of fewer cells in lower areas. In our experience, this agar curvature 
artifact is much more pronounced using the Singer robot plastic pad-based 
cell deposition method rather than a floating pin-based method. 
We correct for these artifacts using a spatial flattening of the colony sizes. 
Specifically, the colony size measurement at each position on one agar plate 
is compared to the median size at that position over all plates in the dataset to 
compute a log-ratio indicating whether that colony is larger or smaller than 
is typical. The resulting set of log-ratios is fit using robust linear regression 
(using MATLAB's robustfit function) to a second order surface (i.e., one of 
the form: z = Ax^2 + By^2 + Cxy + Dx + Ey + F). The resulting surface is 
then subtracted from the log-ratios to remove spatial artifacts, and the 
corrected colony sizes are recovered by exponentiating the result and 
multiplying by the original median size at the corresponding position. 
This correction requires only six parameters and is calculated using 384, 
768, or 1536 measurements, depending on the number of colonies on the 
plate, so it is unlikely to be strongly affected by real genetic interactions. 
Additionally, robust regression, rather than standard linear regression, is 
used to minimize the impact of individual real interactions on the calculated 
correction. Colony sizes of zero and the colonies on the edges of the plate 
are excluded from the correction calculation. In our experience, this cor- 
rection effectively removes spatial patterning, without compromising 
detection of interactions. The MATLAB toolbox contains a graphical 
user interface which displays heatmaps (similar to those seen in Fig. 9.4A) 
before and after the spatial flattening so that the success of this step can 
be assessed. 
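
The flattening procedure described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code of the MATLAB toolbox: a small hand-written iteratively reweighted least-squares fit with bisquare weights stands in for MATLAB's robustfit, plate coordinates are rescaled to the range -1 to 1 for numerical stability, and all function names and the simulated plate are hypothetical.

```python
import math
import random
import statistics

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def terms(x, y):
    """Terms of the second order surface z = Ax^2 + By^2 + Cxy + Dx + Ey + F."""
    return [x * x, y * y, x * y, x, y, 1.0]

def robust_quadratic_fit(points, iterations=20, tune=4.685):
    """Fit the six surface coefficients to (x, y, z) points by iteratively
    reweighted least squares with bisquare weights."""
    weights = [1.0] * len(points)
    coef = [0.0] * 6
    for _ in range(iterations):
        ata = [[0.0] * 6 for _ in range(6)]
        atb = [0.0] * 6
        for (x, y, z), w in zip(points, weights):
            t = terms(x, y)
            for i in range(6):
                atb[i] += w * t[i] * z
                for j in range(6):
                    ata[i][j] += w * t[i] * t[j]
        coef = solve(ata, atb)
        resid = [z - sum(k * t for k, t in zip(coef, terms(x, y)))
                 for x, y, z in points]
        # Robust scale estimate from the median absolute residual.
        scale = statistics.median(abs(r) for r in resid) / 0.6745 or 1e-12
        weights = [(1 - (r / (tune * scale)) ** 2) ** 2
                   if abs(r) < tune * scale else 0.0 for r in resid]
    return coef

def flatten_plate(sizes, reference):
    """sizes: colony sizes for one plate (rows x columns); reference: median
    size at each position over all plates. Returns corrected sizes."""
    nrow, ncol = len(sizes), len(sizes[0])

    def xy(r, c):
        return 2 * c / (ncol - 1) - 1, 2 * r / (nrow - 1) - 1

    # Zero-size colonies and edge colonies are left out of the fit.
    pts = [(*xy(r, c), math.log(sizes[r][c] / reference[r][c]))
           for r in range(1, nrow - 1) for c in range(1, ncol - 1)
           if sizes[r][c] > 0 and reference[r][c] > 0]
    coef = robust_quadratic_fit(pts)
    out = [row[:] for row in sizes]
    for r in range(nrow):
        for c in range(ncol):
            if sizes[r][c] > 0 and reference[r][c] > 0:
                surface = sum(k * t for k, t in zip(coef, terms(*xy(r, c))))
                log_ratio = math.log(sizes[r][c] / reference[r][c])
                out[r][c] = reference[r][c] * math.exp(log_ratio - surface)
    return out

# Simulated 16 x 24 plate: uniform true sizes distorted by a smooth gradient.
rng = random.Random(0)
ref = [[100.0] * 24 for _ in range(16)]
plate = [[100.0 * math.exp(0.3 * (c / 23) ** 2 + 0.2 * r / 15 + rng.gauss(0, 0.02))
          for c in range(24)] for r in range(16)]
flat = flatten_plate(plate, ref)
```

In this simulation the corrected interior sizes return to roughly the reference value, leaving only the random measurement noise.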

A separate correction is applied for the colonies on the edges of the plate. 
For each edge row or column, the colony sizes are scaled such that the 
median size in that row or column is equal to the median size in the interior 
of the plate. This correction is applied because we have found that edge 
rows and columns can be systematically larger or smaller (usually larger) than 
other colonies on the plate due to proximity or distance from the physical 
edge of the agar. 
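
A minimal Python sketch of this edge correction follows. The treatment of the corner colonies (scaled with the top and bottom rows only) is our assumption, as are the function name and the toy plate; the text does not specify how the toolbox handles corners.

```python
import statistics

def correct_edges(sizes):
    """Scale each edge row and column so that its median equals the median
    of the interior colonies. sizes is a 2-D list (rows x columns)."""
    nrow, ncol = len(sizes), len(sizes[0])
    interior = [sizes[r][c] for r in range(1, nrow - 1)
                for c in range(1, ncol - 1)]
    target = statistics.median(interior)
    out = [row[:] for row in sizes]

    def rescale(cells):
        med = statistics.median(out[r][c] for r, c in cells)
        if med > 0:
            for r, c in cells:
                out[r][c] *= target / med

    rescale([(0, c) for c in range(ncol)])                # top row
    rescale([(nrow - 1, c) for c in range(ncol)])         # bottom row
    rescale([(r, 0) for r in range(1, nrow - 1)])         # left column, no corners
    rescale([(r, ncol - 1) for r in range(1, nrow - 1)])  # right column, no corners
    return out

# Toy 6 x 8 plate in which all edge colonies are 30% larger than the interior.
toy = [[130.0 if r in (0, 5) or c in (0, 7) else 100.0 for c in range(8)]
       for r in range(6)]
corrected = correct_edges(toy)
```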

Finally, colony sizes are normalized to account for differences in the 
growth phenotype of the query mutation which is present in all double- 
mutant colonies on the same plate. We apply this normalization in addition 
to the above-described spatial flattening to account for the possibility that 
the query mutation may have more synthetic interactions (where double 
mutants grow more slowly than expected) than positive interactions (where 
double mutants grow faster), or vice versa. We do assume that most muta- 
tions in the array will have little or no growth defect, as is the case for gene 
deletions in yeast (Breslow et al., 2008; Giaever et al., 2002), and that most 
mutation pairs will be noninteracting. We, therefore, normalize the colony 
sizes according to the peak of the histogram of colony sizes on a given plate 
(this is the "Parzen" setting in the toolbox menu). Heatmap images showing 
the spatial pattern of colony size measurements at different stages of the 
normalization process can be seen in Fig. 9.4A. All of these preprocessing 
and normalization steps are implemented in the MATLAB toolbox. 
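
The idea of normalizing to the peak of the colony size histogram can be illustrated with a simple kernel density estimate. In the sketch below, the Gaussian kernel, the rule-of-thumb bandwidth, the grid search, and the simulated plate are all assumptions made for illustration; the toolbox's "Parzen" setting may differ in its details.

```python
import math
import random
import statistics

def density_peak(values, grid_points=200):
    """Location of the peak of a Gaussian kernel (Parzen window) density
    estimate. Bandwidth uses Silverman's rule of thumb (an assumption)."""
    values = [v for v in values if v > 0]
    sd = statistics.stdev(values)
    if sd == 0:
        return values[0]
    bandwidth = 1.06 * sd * len(values) ** (-0.2)
    lo, hi = min(values), max(values)
    best_x, best_d = lo, -1.0
    for i in range(grid_points + 1):
        x = lo + (hi - lo) * i / grid_points
        d = sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
        if d > best_d:
            best_x, best_d = x, d
    return best_x

def normalize_plate(sizes):
    """Divide all colony sizes on a plate by the histogram peak."""
    peak = density_peak(sizes)
    return [s / peak for s in sizes]

# Simulated plate: most double mutants are noninteracting (around 500),
# while a minority grow poorly (around 200).
rng = random.Random(0)
sizes = ([rng.gauss(500, 30) for _ in range(300)]
         + [rng.gauss(200, 50) for _ in range(60)])
peak = density_peak(sizes)
normalized = normalize_plate(sizes)
print(round(peak), round(statistics.mean(sizes)))
```

In this example the mean is pulled down by the slow-growing minority, whereas the peak stays close to the size of the noninteracting majority, which is the behavior the normalization relies on.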

4.2. Computing expected colony sizes 

After normalization, the growth phenotype of the query mutation has been 
accounted for, and we assume that differences in normalized colony size 
now result from either the growth phenotype of the array mutation or a 
genetic interaction. For a given array mutation (which is always present in 
the exact same spatial position within the array), we then empirically 
estimate the expected normalized colony size as the typical normalized 
colony size over a set of screens. If the number of screens is large (generally 
50 or more), we prefer to use the peak of the histogram of normalized sizes 
(the "Parzen" setting in the toolbox), similar to our normalization proce- 
dure. However, if the number of screens analyzed is smaller we have found 
that the median normalized size is a more robust estimate. 

We sometimes observe batch-to-batch variability, where the typical 
colony size estimated for one group of screens completed at approximately 
the same time using the same preparation of media differs from the values 
estimated for another group of screens completed at a different time 
(perhaps weeks or months apart). If such batch-to-batch variability is 
apparent, it is preferable to compute the expected colony sizes indepen- 
dently for each batch. This can be done easily with options included in the 
MATLAB toolbox (web address provided above). 
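
As an illustration of computing expected sizes independently for each batch, the Python sketch below groups screens by a batch label and takes the median normalized size of each array strain within a batch. The median is used because it is the estimator recommended above for small numbers of screens; the input layout, function name, strain names, and values are hypothetical.

```python
import statistics
from collections import defaultdict

def expected_sizes_by_batch(screens):
    """screens: list of (batch_id, {array_strain: normalized_size}).
    Returns {batch_id: {array_strain: median normalized size in that batch}}."""
    pooled = defaultdict(lambda: defaultdict(list))
    for batch, sizes in screens:
        for strain, size in sizes.items():
            if size is not None:  # skip missing measurements
                pooled[batch][strain].append(size)
    return {batch: {strain: statistics.median(vals)
                    for strain, vals in strains.items()}
            for batch, strains in pooled.items()}

# Hypothetical data: three screens in batch A, two in batch B.
screens = [
    ("A", {"yfg1": 1.00, "yfg2": 0.60}),
    ("A", {"yfg1": 1.04, "yfg2": 0.64}),
    ("A", {"yfg1": 0.96, "yfg2": 0.20}),  # yfg2 interacts with this query
    ("B", {"yfg1": 0.90, "yfg2": 0.50}),
    ("B", {"yfg1": 0.92, "yfg2": 0.54}),
]
expected = expected_sizes_by_batch(screens)
print(expected)
```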

A natural question is then how many different screens need to be 
included in a batch, such that the estimated expected colony size values 
will be reliable? We have investigated this question empirically, using 
measurements from the early secretory pathway E-MAP (Schuldiner et al., 
2005). As increasingly many screens are completed in the same batch, the 
error in estimates of expected colony sizes decreases (Fig. 9.4C). There can 
be substantial error if a batch includes fewer than 20 screens. On the other 
hand, each additional screen beyond about the 40th gives only marginal 
improvement. 



4.3. Computing genetic interaction scores 

We compute genetic interaction scores as we have previously described 
(Collins et al., 2006) as S-scores, which are closely related to t-values and 
account for both the magnitude of the interaction effect and our 
confidence that the measurement constitutes a real genetic interaction. The 
S-score differs from a standard t-value in several ways. Rather than 
comparing experimental observations explicitly to control screens, we use the 
expected colony size estimates described above. Additionally, a standard 
t-value can be very sensitive to the standard deviation of a small number of 
experimental replicate measurements. In particular, if the replicates are 
unusually similar, resulting in an unusually small standard deviation, this 
can result in a large score even if the magnitude of the interaction is small. 
We have found that the reproducibility of S-scores is substantially improved 
by implementing a lower bound on the standard deviation measurement 
(i.e., if the measured standard deviation is below the lower bound, we use 
the lower bound instead) (Collins et al., 2006). This lower bound is an 
estimate of the typical standard deviation for double-mutant measurements 
with similar parent query and array strain phenotypes. 
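
The effect of the lower bound can be seen in a deliberately simplified example. The function below is a one-sample t-like score with a floor on the standard deviation; it is not the published S-score formula (see Collins et al., 2006 for that), and the function name, replicate values, and floor value are hypothetical.

```python
import math
import statistics

def modified_t_score(observed, expected, sd_floor):
    """One-sample t-like score for replicate normalized colony sizes,
    with a lower bound (sd_floor) on the standard deviation."""
    n = len(observed)
    sd = max(statistics.stdev(observed), sd_floor)
    return (statistics.mean(observed) - expected) / (sd / math.sqrt(n))

tight = [0.90, 0.90, 0.91]  # unusually similar replicates, modest effect size

unbounded = modified_t_score(tight, expected=1.0, sd_floor=0.0)
bounded = modified_t_score(tight, expected=1.0, sd_floor=0.1)
print(round(unbounded, 1), round(bounded, 1))
```

Without the floor, a roughly 10% reduction in colony size produces a very large negative score simply because the three replicates happen to agree closely; with the floor, the score reflects the modest size of the effect.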

4.4. Quality control 

Careful quality control is an essential part of the screening process. In our 
experience, it is not uncommon for ~10% of all strains (both array and 
query) to be incorrect. These incorrect strains need to be systematically 
identified, and removed from the analysis. 

The most effective tool for identifying incorrect strains is analysis of the 
apparent interaction scores between mutations at chromosomally linked loci 
(Collins et al., 2006; Roguev et al., 2008). The absence of apparent negative 
interactions between a mutation and mutations at loci within ~100 kb in 
S. cerevisiae or ~200 kb in S. pombe is strong evidence that a strain is incorrect. 
The MATLAB toolbox contains a graphical user interface for browsing the 
linkage data which facilitates the identification of incorrect strains. Addition- 
ally, cases where the data for an array strain is completely uncorrelated to data 
for a corresponding query strain should be identified systematically, and if 
found, the corresponding strains should be checked by PCR. 
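
The linkage check lends itself to a simple automated first pass. The sketch below reports, for a given strain, the fraction of scored partners within a chosen window on the same chromosome that show a clearly negative score. The data layout, the score threshold of -2.5, the gene names, and the coordinates are all hypothetical; the 100 kb default window follows the S. cerevisiae figure given above.

```python
def linked_negative_fraction(strain, scores, positions,
                             window_kb=100, threshold=-2.5):
    """Fraction of scored partners within window_kb on the same chromosome
    whose interaction score is below threshold.
    scores: {(query, array): score}; positions: {gene: (chromosome, kb)}.
    Returns None if no linked partner has been scored."""
    chrom, pos = positions[strain]
    linked = [g for g, (c, p) in positions.items()
              if g != strain and c == chrom and abs(p - pos) <= window_kb]
    vals = [scores[pair] for g in linked
            for pair in ((strain, g), (g, strain)) if pair in scores]
    if not vals:
        return None
    return sum(v < threshold for v in vals) / len(vals)

# Hypothetical map positions (chromosome, kb) and scores.
positions = {"geneA": ("IV", 100), "geneB": ("IV", 150), "geneC": ("IV", 180),
             "geneD": ("IV", 900), "geneX": ("IV", 160), "geneE": ("VII", 120)}
scores = {("geneA", "geneB"): -8.0, ("geneA", "geneC"): -6.5,
          ("geneA", "geneD"): 0.2, ("geneE", "geneA"): 0.1,
          ("geneX", "geneB"): 0.3, ("geneX", "geneC"): -0.1}

print(linked_negative_fraction("geneA", scores, positions))  # behaves as expected
print(linked_negative_fraction("geneX", scores, positions))  # candidate incorrect strain
```

Strains with a low fraction are only candidates; as noted above, they should be confirmed by inspecting the linkage data and by PCR.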




5. Extraction of Biological Hypotheses 

Completion of a genetic interaction map creates a huge quantity of 
data. These data have proven useful in numerous instances for generating 
new hypotheses and helping guide ongoing work (Collins et al., 2007a; 
Fiedler et al., 2009; Keogh et al., 2005; Kornmann et al., 2009; Kress et al., 
2008; Laribee et al., 2007; Morrison et al., 2007; Nagai et al., 2008; 
Schuldiner et al., 2005; Wilmes et al., 2008). However, determining the 
best way to navigate these maps and generate hypotheses from them remains 
an area of active work. We expect that we and others will continue to find 
new and better ways to use the data, but we also describe here several 
general approaches that have proven useful in the past. 

5.1. Identifying genes acting in the same pathway using 
patterns of interactions 

One of the simplest and most useful applications of high-throughput genetic 
interaction data is the identification of sets of genes whose products work 
together very closely in a common biochemical pathway. For each pair of 
mutations in a quantitative genetic interaction map, two distinct measures of 
their relationship can be recognized. First, the genetic interaction score 
(S-score) represents the degree of synergizing or mitigating effects of the 
two mutations in combination; this can be neutral (e.g., no interaction), 
positive (e.g., suppression), or negative (e.g., synthetic lethality). Second, 
the similarity (typically measured as a correlation) of their genetic interac- 
tion profiles represents the congruency of the phenotypes of the two 
mutations across a wide variety of genetic backgrounds. One would logi- 
cally expect both measures to be indicative of whether two genes act in the 
same pathway. Indeed, pairs of genes exhibiting positive genetic interac- 
tions and highly correlated genetic interaction profiles very frequently 
encode proteins that are physically associated (Collins et al., 2007a). 
Furthermore, in cases where the proteins do not physically associate, they tend 
to act coherently in a biochemical pathway. This latter case is particularly 
informative as these are very close functional relationships which could be 
difficult to detect by other methods. The combined signature of a positive 
genetic interaction and highly correlated interaction patterns can be 
formalized into a score (the COP score; Collins et al., 2006, 2007a; Schuldiner 
et al., 2005), and the sets of genes with this signature are also often readily 
apparent after hierarchical clustering of genetic interaction profiles. 

In general, the similarity of interaction profiles is much stronger evi- 
dence for membership in the same pathway than an individual positive 
interaction. However, the direct interaction score sometimes provides the 
critical distinction. For example, deletions of DOA1 and UBP6 give very 
similar patterns of genetic interactions (Fig. 9.5B), which likely reflects the 
fact that each deletion leads to depletion of ubiquitin (Collins et al., 2007a). 
However, the double deletion results in a strong negative genetic interac- 
tion, presumably because they largely function independently of each other 
to maintain ubiquitin levels. 
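
The two measures discussed in this section, the direct interaction score and the similarity of interaction profiles, are easy to compute side by side. The Python sketch below calculates a Pearson correlation over the partners scored in both profiles; the profiles, partner names, and the requirement of at least three shared partners are hypothetical choices made for illustration.

```python
import math

def profile_correlation(profile_a, profile_b, min_shared=3):
    """Pearson correlation of two interaction profiles, using only the
    partners scored in both. Profiles are {partner: score or None}."""
    shared = [k for k in profile_a
              if profile_a[k] is not None and profile_b.get(k) is not None]
    if len(shared) < min_shared:
        return None
    xs = [profile_a[k] for k in shared]
    ys = [profile_b[k] for k in shared]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)

# Hypothetical profiles for two mutations; None marks missing data.
a = {"g1": -5.0, "g2": 0.2, "g3": 2.1, "g4": -0.3, "g5": None}
b = {"g1": -4.2, "g2": 0.0, "g3": 1.8, "g4": 0.1, "g5": 1.0}
r = profile_correlation(a, b)
print(round(r, 3))
```

A high correlation combined with a positive direct score suggests the same-pathway signature described above, whereas a high correlation combined with a strongly negative direct score, as for DOA1 and UBP6, points to parallel routes to a shared outcome.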

It should be noted that interpretation of genetic interaction data derived 
from hypomorphic alleles of essential genes may differ from the data 
obtained from deletions of nonessential genes. For example, two hypo- 
morphic alleles may, in fact, give rise to a negative genetic interaction, if 
each mutation partially cripples the same pathway. However, a positive 
genetic interaction may still be observed if the two encoded proteins form a 
tight heterodimer, where the minimum of the two concentrations deter- 
mines the cellular phenotype. 

5.2. Using individual interactions to predict enzyme- 
substrate relationships 

In addition to identifying the core components of coherent pathways, one 
would like to have efficient strategies to suggest points of integration or 
cross-talk between pathways. One important class of such connections 
between pathways includes protein-modifying enzymes and their cognate 
substrates. We have found that kinase-substrate and phosphatase-substrate 
pairs are enriched for positive genetic interactions (Fiedler et al., 2009). 
Pkh1-Sch9 and Sit4-Gcn2 correspond to kinase-substrate and 
phosphatase-substrate relationships, respectively, and in both cases, the 
double deletions result in positive genetic interactions (Fig. 9.5A) 
(Fiedler et al., 2009). 
However, in neither case is the correlation coefficient between the pairs 
notable. Thus, when looking for a critical substrate of a kinase, phosphatase, 
or other protein-targeting enzyme, the genes with the highest scoring 
positive interactions represent excellent candidates, even if the correlation 
between the interaction profiles is weak. It should be noted though, that 
kinases and phosphatases may have multiple substrates. In these cases, only 
enzyme-substrate pairs corresponding to modifications that significantly 
affect the phenotype being measured are likely to be detectable. 

Figure 9.5 Quantitative genetic data reveals insight into functional pathways. 
(A) Individual genetic interactions identify enzyme-substrate relationships. An 
E-MAP focused on the regulation of phosphorylation in budding yeast (Fiedler et al., 
2009) revealed positive genetic interactions between the kinase Pkh1 and its substrate, 
Sch9 (+2.2), between the phosphatase Sit4 and its substrate, Gcn2 (+2.3), and 
between a kinase (Mkk1) and a phosphatase (Ptc1) (+2.9) that acts on Slt2. The 
correlation of genetic interaction profiles between these pairs of genes is below the 



5.3. Using individual interactions to predict opposing enzyme 
relationships 

Similarly, we have found that pairs of opposing kinases and phosphatases 
which share a common substrate are highly enriched for positive genetic 
interactions. Since these enzymes have opposing effects, they will also 
tend not to have correlated interaction patterns. An example is the mito- 
gen-activated protein kinase kinase Mkkl and the phosphatase Ptcl. 
These enzymes have opposing roles regulating the activity of the down- 
stream kinase Slt2. The two deletion mutations have a strong positive 
genetic interaction, but the interaction profiles are not correlated (Fiedler 
et ah, 2009) (Fig. 9. 5 A). Such positive interactions can be key clues for 
deciphering the role of uncharacterized genes and pathways. Indeed, strong 
positive interactions with the H3 K56 histone deacetylase Hst3 was an 
important piece of data suggesting that Rttl09 was the opposing acetylase 
(see below for further discussion of this pathway) (Collins et ah, 2007a). 



individual genetic interactions and was derived from the kinase E-MAP (Fiedler et ah , 
2009). (B) Functional connection between the deubiquitinase enzyme, Ubp6 and the 
ubiquitin chaperone, Doal. A strain containing deletions in both UBP6 and DOA1 
results in a strong negative genetic interaction (—7.8) and the genetic profiles generated 
from these deletions are highly correlated. (C) Using genetic interaction profiles to 
identify a pathway involved in genome integrity. Subsets of interactions (both negative 
and positive) for rttlOlA, mmslA, mms22A, rttl09A, and asflA are displayed. Some 
interactions are observed for all five deletions, interactions with the SWR complex are 
seen with only asflA and rttl09A whereas only deletion of ASF 1 result in negative 
interactions with histones H3/H4, the CAF complex, and factors involved in the 
Rpd3C(S) pathway (Set2, Eaf3, and Rcol). 
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5.4. Dissecting multiple roles of a single gene by detailed 
comparison of interaction patterns 

A major challenge in interpreting genetic interaction data is that many genes
encode multifunctional proteins acting in multiple pathways. While it would
be useful to be able to extract pathway-specific information, genetic
interactions for these genes may arise from a role in one pathway or another,
or the interactions may represent a mixture of effects from more than one
pathway. We have found that in some cases, comparison of the interaction
profile for a multifunctional gene with the profiles for other related genes
can give important pathway-specific insights. For example, we identified a
pathway involving the chromatin assembly protein Asf1, a then-uncharacterized
protein Rtt109, and a putative ubiquitin ligase complex containing Rtt101,
Mms1, and Mms22 (Collins et al., 2007a). Asf1 was the best characterized
member of the group, but it had also been implicated in multiple cellular
roles (Loyola and Almouzni, 2004). Comparison of the genetic interaction
profiles for this group of genes suggested that several of these roles were
Asf1-specific and that all of these factors work together in a pathway
intimately related to histone H3 K56 acetylation and maintenance of genomic
integrity during DNA replication. Further experiments identified Rtt109 as
the acetylase (Collins et al., 2007a; Driscoll et al., 2007; Han et al., 2007).

In the above case, a critical observation was that a subset of asf1Δ's
interactions was unique to asf1Δ, while another large subset was shared
uniformly with the rest of the group. All members of this pathway display
positive genetic interactions with one another, positive interactions with
replication checkpoint genes (MRC1, TOF1, and CSM3), and negative
genetic interactions with genes involved in DNA replication (POL30,
ELG1, RAD27), the spindle checkpoint (BUB1, BUB2, and BUB3),
DNA repair (NUP60, NUP84, HEX3, SLX8), and ubiquitin regulation
(UBP6, DOA1, RPN6, RPN10). Asf1 was also known to be required for
histone H3 K56 acetylation, and the shared positive interaction between all
members of this pathway and the gene encoding a K56 deacetylase (HST3)
suggested that this role of Asf1 was likely to be central to the pathway
(Celic et al., 2006; Maas et al., 2006).

On the other hand, only asf1Δ displays negative interactions with factors
involved in general chromatin assembly: the histone genes HHF1, HHF2,
HHT1, HHT2 and components of the CAF complex (MSI1, CAC2, and
RLF2), arguing that Asf1's general role in chromatin assembly is
independent of its function with Rtt109. Similarly, only deletion of ASF1
results in negative interactions when combined with deletions of the
Rpd3C(S) histone deacetylation complex (EAF3 and RCO1) and SET2, which
codes for a histone methyltransferase enzyme. Eaf3, Rco1, and Set2
function together in a histone deacetylation/methylation pathway that is
required for maintaining chromatin integrity during transcription and
suppressing cryptic initiation by RNAPII (Carrozza et al., 2005; Joshi and
Struhl, 2005; Keogh et al., 2005), again suggesting that Asf1 alone impinges
on this process. Indeed, deletion of ASF1, but not of the other factors
functioning in the K56 acetylation pathway, results in spurious transcription
arising from the inability to suppress cryptic initiation (Schwabish and
Struhl, 2006).




6. Perspective 

The term "epistatic" was originally used in 1907 by Bateson to 
describe a masking effect whereby a variant or allele at one locus prevents 
the variant at another locus from manifesting its effect (Bateson, 1907). 
Over the past one hundred years, geneticists have uncovered important 
biological insight from "epistatic" interactions, first qualitatively and then 
much more efficiently through quantitative analysis. The vast majority of 
genetic interaction data has been collected from simpler systems like yeast 
and bacteria using basic read-outs like colony size or growth rates. We are 
now in a position to collect this type of data in multicellular systems and the 
phenotypic read-outs, which are expanding exponentially, can occur at the 
organismal level, providing invaluable information about not only func- 
tional pathways that comprise key biological processes but also about 
evolution and behavior. 

Abstract

In this chapter, we describe a series of genome-wide, cell-based assays that 
provide a solid basis for understanding drug-gene interactions, gene function, 
and for defining the mechanism of action of small molecules. Each of these 
assays takes advantage of the ability to grow complex pools competitively and 
to use high-density microarrays that report the results of such screens. 
The assays described here take advantage of alterations in gene dosage of
Saccharomyces cerevisiae, and include HIP (haploinsufficiency profiling), HOP 
(homozygous profiling), and MSP (multicopy suppression profiling) as genetic 
tools to understand gene function and drug mechanism. The common experi- 
mental theme is that, in each assay, strains are pooled and screened in parallel 
to investigate the relative contribution of each gene product to sensitivity or 
resistance to a drug or environmental perturbation across the genome in a 
single assay. Further, the compendium of results from these screens can inform 
large-scale network analysis of genetic function, gene-gene interactions, and 
mechanism of drug action. 




1. Introduction 

Our initial pooled screening platform was designed to interrogate the
yeast deletion collection, a genome-wide set of arrayed single-gene deletion
strains in Saccharomyces cerevisiae (Giaever et al., 2002) that can be used in
place of (or as a complement to) random-mutant libraries and individually
constructed strains. One advantage of this collection over others is that it
includes all genes and therefore does not suffer from biases. The collection
has enabled the development of methods for studying all ~6000 deletion
strains in parallel in a single culture and a single assay (Giaever et al.,
2002; Shoemaker et al., 1996; Winzeler et al., 1999). Specifically, unique
20 bp DNA "barcodes" or "tags" incorporated into each strain enable relative
strain abundance to be determined by amplifying the barcodes using common
flanking primers and hybridizing the amplicons to a microarray that
carries the tag complements (Fig. 10.1). For example, pooled analysis of all
~6000 heterozygous deletion strains can be used to identify novel drug
targets (Giaever et al., 1999, 2004; Lum et al., 2004) based on their relative
sensitivity in the haploinsufficiency profiling (HIP) assay. The rationale
behind this technique is as follows: if a locus encodes the target of a drug
or small molecule and this target is important for growth, then the decreased
gene dosage in the heterozygous deletion strain, combined with further
reduction in gene function due to drug binding, will confer increased
sensitivity to the drug. Therefore, in theory, the most sensitive strain in
the pool will be heterozygous for the drug target (Ghose et al., 1999;
Giaever et al., 2004; Lum et al., 2004). In a similar manner, analysis of
competitive assays using the collection of ~4700 nonessential genes with
the homozygous profiling (HOP) assay can reveal information about the
drug target pathway, such as buffering interactions. In this case, the assay
mimics a double deletion mutant, because one gene is completely absent,
and the second is diminished in function by the action of the compound
(Hillenmeyer et al., 2008; Parsons et al., 2004, 2006). Thousands of
genome-wide screens performed to date, including those described above,
have provided a wealth of functional information on the yeast genome
(Alberts, 1998; Birrell et al., 2002; Deutschbauer et al., 2005; Giaever
et al., 1999, 2002, 2004; Kastenmayer et al., 2006; Lee et al., 2005; Lum
et al., 2004; Ooi et al., 2001; Parsons et al., 2004, 2006; Shoemaker et al.,
1996; Steinmetz et al., 2002; Winzeler et al., 1999) and have helped make
yeast the most well-characterized organism to date.

Figure 10.1 Illustration of the chemogenomic platform that interrogates yeast deletion
and overexpression pools with a single TAG4 array. Fitness profiling of pooled deletion
strains involves six main steps: (1) Strains (or multicopy transformants) are pooled at
approximately equal abundance. (2) The pool is grown competitively in the condition of
choice. If a gene is required for growth under this condition, the strain carrying this
deletion will grow more slowly and become underrepresented in the culture
(light-coloured strain). Conversely, the strain carrying a plasmid that confers resistance
to compound will become overrepresented in the culture (light-coloured strain). (3) In
deletion profiling, genomic DNA is isolated from cells harvested after a predefined
number of generations. For MSP, plasmid DNA is isolated when the experiment is
complete. (4) Barcodes are amplified from the genomic DNA with universal primers in
either one (MSP) or two (deletion profiling) PCR reactions. (5) PCR products are
hybridized to an array with complementary probes. (6) Tag intensities for the treatment
sample are compared to tag intensities for a control sample to determine the relative
fitness of each strain. Genes that confer both resistance when overexpressed and
sensitivity when deleted are more likely to be directly related to the drug's mechanism
of action.

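
The comparison in step 6 of Fig. 10.1 can be illustrated with a minimal sketch. It assumes that relative fitness is summarized as a log2 ratio of control to treatment tag intensity; the chapter does not specify the scoring formula, and the strain names, intensity values, and the intensity floor below are hypothetical.

```python
import math

def log2_fitness_ratios(control, treatment, floor=1.0):
    """Return log2(control / treatment) tag intensity for each strain.

    A positive value means the strain is depleted in the treated pool
    (sensitive); a negative value means it is enriched (resistant).
    The floor (an assumption) avoids division by zero for dim tags.
    """
    ratios = {}
    for strain, c in control.items():
        t = treatment.get(strain)
        if t is None:
            continue  # tag not measured in the treatment sample
        ratios[strain] = math.log2(max(c, floor) / max(t, floor))
    return ratios

# Hypothetical tag intensities for three strains.
control = {"strainA": 2000.0, "strainB": 1500.0, "strainC": 1800.0}
treatment = {"strainA": 250.0, "strainB": 1500.0, "strainC": 3600.0}

ratios = log2_fitness_ratios(control, treatment)
# Most sensitive strains first.
ranked = sorted(ratios, key=ratios.get, reverse=True)
print(ranked)
```

In practice, intensities would first be normalized between arrays; that step is omitted here.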
Multicopy suppression profiling (MSP) is essentially the reverse of the
HIP and HOP assays, focusing on increased rather than decreased gene
dose. While the concept is not new, current parallel screening
technologies provide results on a genome-wide scale at a far greater level of
resolution. DNA clone libraries overexpressing gene products are screened
competitively to identify those that confer resistance to compounds
(Fig. 10.1). Traditional multicopy suppressor screens involve cumbersome
plating techniques (requiring large amounts of compound) and necessitate
sequencing of individual clones (Rine et al., 1983). Moreover, if the wrong
time point is assayed, or the wrong drug concentration is used, the result can
be dominated by a gene product unrelated to the drug target, for example, a
drug pump. MSP is a variation of the overexpression resistance concept and
uses a high-copy, random genomic library (Hoon et al., 2008) or an
inducible ORF library (Butcher and Schreiber, 2006). MSP easily allows
for collection of several time points to avoid domination of the culture by
any one strain. Subsequent analysis allows assessment of the relative
resistance of all clones, and an amplification step followed by hybridization
to an array (TAG4) ranks the most predominant clones based on their relative
abundance. More recently, systematic collections of yeast overexpression
clones have become available (Ho et al., 2009; Jones et al., 2008). Because
these collections are complete, not subject to library biases, and (in the case
of Ho et al., 2009) barcoded, these libraries will clearly improve and expand
the results that can be obtained using the MSP assay.

It is worth noting that the HIP/HOP (loss-of-function) screens and the
MSP (gain-of-function) screens are each valuable on a screen-by-screen basis.
However, combining results from all assays allows one to examine the effect of
both increasing and decreasing gene dosage, which can be particularly useful
for distinguishing the bona fide target from a longer list of potential
candidates. Table 10.1 provides a summary of the uses and complementary
information gleaned by using the three screening methods. For example,
sensitivity in the HIP assay and resistance in the MSP assay do not guarantee
that the affected gene product is the bona fide drug target, as the gene could
encode a drug pump. However, the HOP assay can eliminate such a pump as a
possible target: if the corresponding deletion strain displays increased
sensitivity in the HOP assay, it is unlikely to be a direct target of the drug.

In addition to generating information about single genes and individual 
drugs, large sets of experiments can reveal novel relationships between genes 
and drugs. For example, for a given gene pair, a greater correlation of fitness 
values (cofitness) across conditions suggests functional relatedness of the two 
genes. Similarly, for a given drug pair, the correlation of fitness values across 
the strains is a measure of the similarity of their mechanism of action, and 
often, their structure (Hillenmeyer et ah, 2008). This type of analysis allows 
characterization of both genes and compounds with previously unknown 
function, and is one of the primary aims of functional genomics. 
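
As a minimal sketch of the cofitness idea described above, the correlation of fitness values for a gene pair across conditions can be computed as follows. It assumes a Pearson correlation (the text says only "correlation"), and the gene names and fitness values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical fitness-defect scores for three genes across four conditions.
fitness = {
    "geneA": [0.1, 2.0, 0.2, 1.8],
    "geneB": [0.0, 1.9, 0.3, 2.1],
    "geneC": [1.5, 0.1, 1.7, 0.0],
}

cofit_ab = pearson(fitness["geneA"], fitness["geneB"])  # similar profiles
cofit_ac = pearson(fitness["geneA"], fitness["geneC"])  # opposite profiles
print(round(cofit_ab, 3), round(cofit_ac, 3))
```

The same function applies to a drug pair if each profile instead holds the fitness values of all strains under one drug.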
Table 10.1 Interpreting results from HIP, HOP, and MSP

Assay results                                HIP (a)   HOP (b)   MSP (c)

Direct molecular target of drug,             S         NA        R
  essential gene
Direct molecular target of drug,             S         R         R
  nonessential gene
Essential genes encoding proteins            S         NA        not S or R
  involved in target pathway
Genes involved in drug detoxification,       S         NA        R
  essential genes
Genes involved in drug detoxification,       S         S         R
  nonessential genes

(a) Genes essential for survival.
(b) Genes not essential for survival.
(c) All genes.
S, sensitive; R, resistant; NA, not applicable.

Because assay readout is growth, most targets are necessarily essential for
survival except those nonessential deletions that have a growth phenotype in
the absence of compound.
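
The interpretation rules of Table 10.1 can be encoded as a simple lookup. This is a minimal sketch; the dictionary layout and the function name `interpret` are illustrative and not part of the published method.

```python
# Keys are (HIP, HOP, MSP) outcomes; values are the interpretations
# listed in Table 10.1 for that combination.
TABLE_10_1 = {
    ("S", "NA", "R"): [
        "direct molecular target of drug (essential gene)",
        "drug detoxification (essential gene)",
    ],
    ("S", "R", "R"): ["direct molecular target of drug (nonessential gene)"],
    ("S", "NA", "not S or R"): ["essential gene in target pathway"],
    ("S", "S", "R"): ["drug detoxification (nonessential gene)"],
}

def interpret(hip, hop, msp):
    """Return the Table 10.1 interpretations consistent with three results."""
    return TABLE_10_1.get((hip, hop, msp), [])

print(interpret("S", "S", "R"))
print(interpret("S", "NA", "R"))
```

For an essential gene the HOP result is not applicable, so the pattern S/NA/R returns two interpretations; the three assays alone do not separate an essential target from an essential detoxification gene.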




2. Methodology 

The pooled fitness assay involves seven main steps, described in detail
below.

1. Yeast deletion strain pool and MSP pool construction
2. Determining the drug dosage
3. Experimental pool growth
4. Purification and amplification
5. Hybridization
6. Analysis of results
7. Confirmation of microarray data



2.1. Yeast deletion strain pool and MSP pool construction 

Allow 1 week to generate pooled aliquots of cells. Pooling is performed
infrequently and cells can be stored indefinitely at -80 °C.



2.1.1. Yeast deletion strain pool construction 

1. Thaw the frozen glycerol stocks completely for the strains of interest 
(such as the genome-wide homozygous deletion collection) but avoid 
exposing thawed cells to room temperature for more than 2 h. 
2. To sterilize a 96-well pin tool, dip the pin tool in water to rinse away any
remaining cells, followed by 2 dips in 70% ethanol baths (pipette tip box
lids work well), flame the pin tool, and cool for 1 min. The level of the
ethanol baths should exceed the level in the water bath to ensure all
carry-over cells are flamed and removed. Replace water every 4-6 pinnings.

3. Insert the sterile 96-well pin tool into a thawed 96-well plate, swirl
gently, and transfer it to a Nunc Omni Tray containing YPD-agar
including the appropriate antibiotic. Following the same procedure,
transfer the cells from all remaining 96-well plates to agar plates. Grow
colonies at 30 °C until they reach maximal size (2-3 days).

4. After colonies have reached full size, make note of any missing or
slow-growing strains. These should be individually repinned and added at
twice the amount of the rest of the collection.

5. Working in a sterile environment, flood each plate with 5-10 ml media, soak
for 5 min, and resuspend colonies with a cell spreader. Pour the liquid
plus cells into a 50 ml conical centrifuge tube and add glycerol to 15% or
DMSO to 7% (vol/vol).

6. Measure the OD600 of the pool and adjust (by dilution or centrifugation
and resuspension) to a final concentration of 50 OD600/ml with media
containing 15% glycerol or 7% DMSO.



2.1.2. MSP pool construction 

1. Take a suitable S. cerevisiae random genomic library (we typically use a
library constructed in a high-copy 2 µm expression vector (YEplac195))
and transform into yeast (BY4743) (Brachmann et al., 1998; Winzeler
et al., 1999) by a standard lithium acetate method (Gietz and Schiestl,
2007) and select on medium lacking uracil (ura-). The MoBY-ORF
collection (Ho et al., 2009) should work quite well for MSP.

2. After 3 days of growth, make sure you have at least 1 transformant colonies.

3. Resuspend colonies by flooding the plates as described above and pool into
ura- medium containing 7% DMSO, aliquot as 10-25 µl samples of pool
into individually capped PCR tubes, and store at -80 °C until use.



2.2. Determining the drug dosage 

Successful genome-wide assays require a drug/compound dosage that
affects the rate of pooled growth. To determine this dose for HIP/HOP
and MSP, we prescreen compounds using wild-type cells. For the deletion
(loss-of-function) screens, we aim for treatment doses that cause a 10-20%
decrease in the growth rate of wild type (Fig. 10.2). At 10-20% inhibition,
optimal results for the heterozygous collection are usually obtained at the
20 generation time point, and optimal results for the homozygous collection
are usually obtained at the 5 generation time point. Heterozygous strains
have more subtle differences in growth rate (typically < 5%) and therefore
require a longer time to resolve growth differences. For the MSP
(gain-of-function) screens, we aim for a compound dose that inhibits wild-type
growth by 70-90%. In this case, cells are collected when the OD of the
culture reaches 2.0, regardless of the number of generations required to
achieve this culture density. These generation times and doses have been
empirically determined over the last several years and provide a
well-informed starting point; nonetheless, variations in generation times and
degree of inhibition may improve results.

Figure 10.2 Prescreening compounds against wild-type yeast to determine an
appropriate dose for genome-wide screening. (A) A 96-well flat bottom plate is filled
with 100 µl of cell suspension at an OD of 0.062. Two microliters of compound (typically
dissolved in DMSO) is added using a slotted pin tool or multichannel pipette. Cells are
grown with constant shaking for 16-20 h at 30 °C. The final concentration of DMSO
should not exceed 2%. In this example, column 12 contains vehicle (DMSO) control,
and columns 1-11 contain decreasing amounts of test compound. In each well of the
plate, the growth curve in test compound is plotted in black against a plot of the control
growth curve in grey. (B) Higher resolution image of several prescreens obtained with
5-FU overlaid on top of one another. In this titration series, an IC10-15 is obtained with
4.29 µM 5-FU and an IC70 is obtained with 32 µM 5-FU. The former dose would be
appropriate for deletion profiling (HIP and HOP) whereas the latter dose would be
appropriate for resistance profiling (MSP). Due to the nonlinearity at higher optical
densities, Tecan ODs were converted to "traditional" 1 mm cuvette ODs using the
calibration function "real OD" = -1.0543 + 12.2716 × measured OD.
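
The dose windows above can be applied to a prescreen titration as in the following minimal sketch. It assumes inhibition is expressed as the percent reduction in growth rate relative to the vehicle control; the growth rates, doses, and function names are hypothetical.

```python
def percent_inhibition(control_rate, treated_rate):
    """Percent decrease in growth rate relative to the vehicle control."""
    if control_rate <= 0:
        raise ValueError("control growth rate must be positive")
    return 100.0 * (1.0 - treated_rate / control_rate)

def suggested_assay(inhibition):
    """Map an inhibition level onto the dose windows given in the text."""
    if 10.0 <= inhibition <= 20.0:
        return "HIP/HOP"  # deletion (loss-of-function) profiling
    if 70.0 <= inhibition <= 90.0:
        return "MSP"  # resistance (gain-of-function) profiling
    return None  # outside both windows

# Hypothetical titration: dose in µM -> wild-type growth rate (doublings/h).
control_rate = 0.50
titration = {1.0: 0.49, 4.0: 0.43, 16.0: 0.30, 32.0: 0.10}

for dose, rate in sorted(titration.items()):
    inhibition = percent_inhibition(control_rate, rate)
    print(dose, round(inhibition, 1), suggested_assay(inhibition))
```

Doses that fall between the two windows would simply not be carried forward to either screen.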



2.2.1. Materials for yeast deletion strain pool and MSP pool
construction and determining the drug dosage

1. Frozen glycerol stocks of the yeast deletion collection in 96-well microtiter
plates (Open Biosystems, Part Nos. YSC1056 and YSC1055).

2. Nunc Omni Trays (VWR, Catalog No. 62409-600).

3. 96-well pin tool (V&P Scientific, Catalog No. VP407A).

4. 30 °C incubator for growing plated yeast.

5. Spectrophotometer.

6. G418, Nourseothricin (Agri-Bio, Catalog No. 3000; Werner BioAgents,
Catalog No. 500100).

7. Cell spreader (VWR, Catalog No. 89042-020).


2.3. Experimental pool growth for deletion and 
overexpression collections 

The laboratory procedure starting from thawing the frozen aliquots of
pooled cells is described below and visualized in Fig. 10.3. Examples of
results using the HIP, HOP, or MSP assay can be seen in Fig. 10.4.

Figure 10.3 Timeline for pooled growth experiments. (1) Cultures are inoculated
using thawed aliquots of pooled cells. (2) Cells are grown for the desired number of
generations (typically 5-20 generations, 16-72 h). The specific amount of time needed
for growth will vary depending on the number of generations and the level of inhibition
of the treatment. (3) Cells are harvested by centrifugation. (4) Genomic DNA or
plasmid DNA is purified from the cell pellets using standard column-based purification
kits. (5) Tags (or genomic DNA fragments) are PCR-amplified. (6) PCR products are
hybridized to an array, which is then washed and scanned.

2.3.1. Deletion profiling (HIP/HOP)

1. Thaw a frozen aliquot of pooled cells on ice.

2. Immediately dilute the pool into media with drug or condition of
choice, in parallel with the appropriate solvent controls. Inoculate
cultures using either option a or b.

(a) Automated robotic cell growth. Inoculate cells in medium with drug or
vehicle at an OD600 of 0.0625 in a total volume of 700 µl in a 48-well
plate, and seal with a plastic plate seal. Similarly, prepare wells as
above but without cells to be used by the robot for the inoculation at
5-generation intervals. If the condition requires optimal aerobic
growth (e.g., nonfermentable carbon sources), poke a small hole in
the membrane seal toward the side of each well. Grow in a
spectrophotometer at 30 °C at an experimentally determined optimal
shaking regimen. Cells can normally be grown to a final OD600 of
2.0, which equates to 5 generations of growth. Part of the cell
suspension can be saved on a cold plate on the robotic deck at user-defined
generation times (http://med.stanford.edu/sgtc/technology/access.html;
for details contact C. Nislow or G. Giaever).

(b) Manual cell growth. Inoculate a 50 ml culture at a starting OD600 of
0.0020 in a 250 ml culture flask. Grow in a shaking incubator at
30 °C at 250 rpm. After cells reach a final OD600 of 2.0 they will
have undergone 10 generations of growth.

3. Collect 1-2 OD600 of cells in a 1.5 ml microfuge tube after growth to the
desired number of generations. Pellet cells, and remove the media. If
option (b) above has been used, normalize the ODs between all samples.

4. If not proceeding directly to the next step, cell pellets can be stored at
-20 °C.
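
The generation counts quoted in options (a) and (b) follow from the fold increase in OD600, since each generation doubles the culture density. A minimal sketch of that arithmetic (the function name is illustrative):

```python
import math

def generations(start_od, final_od):
    """Number of population doublings between two optical densities."""
    if start_od <= 0 or final_od <= 0:
        raise ValueError("optical densities must be positive")
    return math.log2(final_od / start_od)

# Option (a): OD600 0.0625 -> 2.0 is a 32-fold increase.
robotic = generations(0.0625, 2.0)
# Option (b): OD600 0.0020 -> 2.0 is a 1000-fold increase.
manual = generations(0.0020, 2.0)

print(round(robotic, 2), round(manual, 2))
```

This assumes OD600 remains proportional to cell number over the range used.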

Figure 10.4 Examples of HIP-HOP and MSP results. Panels (A-D) show the results
of HIP or HOP screens in which the heterozygous deletion strains were grown for 20
generations and the homozygous strains for 5 generations. For methotrexate (A) and
lovastatin (B), the deletion screens identify the drug targets (DFR1 and HMG1,
respectively) as the most sensitive heterozygotes. For cisplatin (C), several
homozygotes deleted for components of the DNA damage response appear as sensitive
(Lee et al., 2005). For clozapine (D), an atypical antipsychotic compound for which
no known target exists in yeast, a number of nonessential genes involved in vesicle
traffic and protein sorting appear as significantly sensitive (Ericson et al., 2008).
Panels (E-F) show the results of two MSP screens. In the case of methotrexate and
fluconazole, the known targets (DFR1 and ERG11, respectively) appear as the most
resistant clones (Hoon et al., 2008).

2.3.2. MSP profiling

MSP assays are performed as described above, except that cells are grown in
medium lacking uracil, in the presence of high doses of compound, and that
cells are collected when the OD of the culture reaches 2.0. Because it is
impossible to predict when a resistant clone (or clones) will "overtake" the
culture, we typically allow 2-4 days for an MSP growth assay.

Note: A starting cell sample (i.e., a "T0 time point") is required to assess
initial strain representation in the newly created pool. To prepare such a
sample, add 1-2 OD600 of pool directly from the freezer aliquots to a 1.5 ml
tube and process as described above.

When choosing between option (a) and option (b) for the pool growth, consider
the following. Option (a) features a higher throughput, smaller culture
volume and compound cost, and automated inoculation and cell suspension
steps compared to option (b). However, option (a) has a higher sampling
error due to the small culture volume. The sampling error during cell growth
is currently the largest source of error when growing in small cultures. Option
(b) allows cells to be collected at ~1000 cells/strain, which will increase
accuracy. However, manual growth is limited in the ability to collect cells at
exact generation times, which is important for the downstream analysis. For
both options (a) and (b), the condition and control samples need to be
collected at the same number of generations.

2.3.3. Materials 

1. Temperature-controlled shaker for 250 ml flasks or shaking
spectrophotometer such as Infinite F200 (Tecan; www.tecan.com). (Note: many
spectrophotometers do not shake sufficiently hard for growing yeast in
suspension.)

2. 48-well plates (Greiner, Part No. 677102). 

3. Adhesive plate seals (ABgene, Catalog No. AB-0580). 

4. 250 ml culture flasks (if growing cultures in flasks). 



2.4. Purification and amplification of barcodes and ORFs 
2.4.1. Deletion profiling (HIP/HOP) 

1. Purify genomic DNA from ~2 OD600 of cells with the Zymo Research
YeaStar kit using Protocol I (included with the kit), or another suitable
method for purifying yeast genomic DNA. Genomic DNA can be
stored indefinitely at -20 °C.

2. If desired, quantify genomic DNA using a gel or a UV spectrophotometer.

3. Set up two 60 µl PCR reactions for each sample, one for the uptags and
one for the downtags (33 µl dH2O, 6 µl 10x PCR buffer without
MgCl2, 3 µl 50 mM MgCl2, 1.2 µl 10 mM dNTPs, 1.2 µl 50 µM Up
or Down primer mix (see below), 0.6 µl 5 U/µl Taq polymerase, ~0.1
µg genomic DNA in 15 µl).

4. Cycle as follows in a thermocycler with a heated lid: 94 °C 3 min; repeat
30x: 94 °C 30 s, 55 °C 30 s, 72 °C 30 s; 72 °C 3 min, hold at 4 °C. PCR
products can be stored at -20 °C.

5. Check the resulting PCR products on a gel, expecting a ~60 bp product
(see note 4 under Section 3).

6. If not proceeding to the next step immediately, store PCR products at
-20 °C in a non-frost-free freezer.
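
When many samples are processed, the reagent volumes from step 3 can be scaled into a master mix. The following is a minimal sketch; the 10% pipetting overage and the function name are assumptions, not part of the protocol.

```python
# Per-reaction volumes in µl, taken from step 3. Template DNA (15 µl)
# is added to each tube separately and is not part of the master mix.
PER_REACTION_UL = {
    "dH2O": 33.0,
    "10x PCR buffer without MgCl2": 6.0,
    "50 mM MgCl2": 3.0,
    "10 mM dNTPs": 1.2,
    "50 µM Up or Down primer mix": 1.2,
    "Taq polymerase (5 U/µl)": 0.6,
}
TEMPLATE_UL = 15.0
REACTION_UL = 60.0

def master_mix(n_reactions, overage=0.10):
    """Scale each reagent for n reactions plus a fractional overage.

    The default 10% overage is an assumed allowance for pipetting loss.
    """
    if n_reactions < 1:
        raise ValueError("need at least one reaction")
    scale = n_reactions * (1.0 + overage)
    return {name: round(vol * scale, 2) for name, vol in PER_REACTION_UL.items()}

# Example: uptag reactions for 8 samples.
mix = master_mix(8)
for reagent, volume in mix.items():
    print(f"{reagent}: {volume} µl")
```

Uptag and downtag reactions need separate master mixes because they use different primer mixes; each tube then receives 45 µl of mix plus 15 µl of template.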

2.4.1.1. Materials

1. YeaStar Genomic DNA Kit (Zymo Research, Catalog No. D2002).

2. Taq DNA polymerase (Invitrogen, Catalog No. 10342).

3. dNTPs (Invitrogen, Catalog No. 10297).

4. 10x PCR reaction buffer (-MgCl2) (Invitrogen, Part No. 52724).

5. 50 mM MgCl2 (Invitrogen, Part No. 52723).

6. Up primer mix: Dissolve Uptag (5'-GAT GTC CAC GAG GTC TCT-3')
and Buptagkanmx4 (5' biotin-GTC GAC CTG CAG CGT ACG-3')
each in dH2O at 100 pmol/µl, then mix in a 1:1 ratio for a final
concentration of 50 pmol/µl each. Store at -20 °C.

7. Down primer mix: Dissolve Dntag (5'-CGG TGT CGG TCT CGT AG-3')
and Bdntagkanmx4 (5' biotin-GAA AAC GAG CTC GAA TTC
ATC G-3') each in dH2O at 100 pmol/µl (= 100 µM), then mix in a 1:1
ratio for a final concentration of 50 pmol/µl each. Store at -20 °C.

8. Thermocycler with heated lid. 

2.4.2. MSP 

1. Isolate plasmids using the Zymoprep II plasmid isolation kit.

2. Amplify inserts by PCR using common M13 primers in the amplification
conditions described above (M13 forward primer: 5'-GTT GTA AAA
CGA CGG CCA GT-3'; M13 reverse primer: 5'-CAG GAA ACA
GCT ATG ACC-3').

3. Purify PCR products using QIAquick PCR purification kit. 

4. Label inserts with biotin using the BioPrime labeling kit. 

5. Hybridize labeled inserts to Affymetrix TAG4 arrays using the same 
protocols as described for TAG hybridization below. 

2.4.2.1. Materials

1. Zymoprep II plasmid isolation kit (Zymo Research; Catalog No. D2004)

2. FailSafe™ PCR System (EPICENTRE Biotechnologies)

3. QIAquick PCR purification kit (Qiagen; Catalog No. 28104)

4. BioPrime labeling kit (Invitrogen; Catalog No. 18094-011)

5. M13 forward primer: 5'-GTT GTA AAA CGA CGG CCA GT-3'

6. M13 reverse primer: 5'-CAG GAA ACA GCT ATG ACC-3'



2.5. Hybridization 

1. Set up a boiling water bath with a floating rack for 1.5 ml tubes and a 
slushy ice bucket. Set hybridization oven temperature to 42 °C. 

2. Fill the arrays with 1x hybridization buffer.

3. Prewet the array at 42 °C and 20 rpm for at least 10 min in the 
hybridization oven. 

4. Immediately before use, prepare 90 µl hybridization mix per sample
(75 µl 2x hybridization buffer, 0.5 µl B213 control oligonucleotide
(0.2 fm/µl), 12 µl mixed oligonucleotides (12.5 pm/µl), 3 µl 50x
Denhardt's solution) in 1.5 ml tubes suitable for boiling. MSP
hybridization conditions are identical except that MSP does not employ any
blocking oligos.

5. While arrays are equilibrating, add 30 µl of uptag PCR and 30 µl of
downtag PCR to 90 µl of hybridization mix for a total volume of
150 µl. (For MSP add the entire labeling reaction.)

6. Boil for 2 min and set on ice for at least 2 min. 

7. Spin tubes briefly. 

8. Remove hybridization buffer from the arrays and replace with the
hybridization mix (kept on ice).

9. Place a Tough-Spot (or other adhesive tape) over each of the two 
gaskets to prevent evaporation and hybridize for 16 h at 42 °C and 
20 rpm. 

10. Immediately before use, prepare 600 µl biotin labeling mix per sample
(180 µl 20x SSPE, 12 µl 50x Denhardt's solution, 6 µl 1% Tween 20
(vol/vol), 1 µl 1 mg/ml streptavidin-phycoerythrin, 401 µl dH2O).

11. Aliquot 600 µl biotin labeling mix per chip into tubes. Remove
Tough-Spots from chips.

12. Remove hybridization mix from the microarrays and save it at -20 °C
(can be reused if needed). Fill chips with Wash A.

13. Wash the microarrays using the Affymetrix fluidics station according to 
the manufacturer's instructions. After priming the station, use the 
"Gene-Flex_Sv3_450" protocol with the following modifications: 1 
extra step with Wash A (1 cycle, 2 mixes) before staining, Wash B 
temperature 42 °C instead of 40 °C, stain at 42 °C instead of 25 °C. 
This same protocol is used for processing MSP samples. Note: it is
possible to perform the posthybridization wash, the biotin staining,
and the poststaining wash manually; see page 396 in C. Nislow and
G. Giaever, 2007.

14. Verify the absence of air bubbles in the microarrays. If bubbles are
present, insert the chip again, and it will automatically be refilled with 
Wash A. If there are any marks or smudges on the array surface, clean 
the glass window on each array with isopropanol and a cotton swab or 
lint-free wipe. Put Tough-Spots on the gaskets to prevent evaporation 
and put arrays in scanner. 

15. Scan at an emission wavelength of 560 nm. 

16. When all fluidics operations are complete, run the fluidics station 
"SHUTDOWN_450" protocol. 

2.5.1. Materials 

1. Genflex Tag 16K Array v2 (Affymetrix, Part No. 511331). 

2. Hybridization Oven 640 (Affymetrix, Part No. 800138). 

3. GeneChip Fluidic Station 450 (Affymetrix, Part No. 00-0079). 

4. GeneArray Scanner 3000 (Affymetrix, Part No. 00-0212). 

5. 1.5 ml Microfuge tubes (suitable for boiling). 

6. Boiling water bath with floating rack for 0.5 ml tubes. 

7. Teeny Tough-Spots (Diversified Biotech, Catalog No. TS-TNY). 

8. Denhardt's Solution, 50 X concentrate (e.g., Sigma, Catalog No. D-2532). 

9. Streptavidin, R-phycoerythrin conjugate (SAPE) (Invitrogen, Catalog
No. S-866). Store at 4 °C. Do not freeze. 

10. B213 oligonucleotide: (5' biotin-CTGAACGGTAGCATCTTGAC- 

11. Mixed oligonucleotides: Dissolve each of the following eight oligos in
dH2O at 100 pmol/µl: Uptag (5'-GAT GTC CAC GAG GTC TCT-
3'), Dntag (5'-CGG TGT CGG TCT CGT AG-3'), Uptagkanmx (5'- 
GTC GAC CTG CAG CGT ACG-3'), Dntagkanmx (5'-GAA AAC 
GAG CTC GAA TTC ATC G-3'), Uptagcomp (5'-AGA GAC CTC 
GTG GAC ATC-3'), Dntagcomp (5'-CTA CGA GAC CGA CAC 
CG-3'), Upkancomp (5'-CGT ACG CTG CAG GTC GAC-3'), 
Dnkancomp (5'-CGA TGA ATT CGA GCT CGT TTT C-3'). 

12. 0.5 M EDTA (BioRad, Catalog No. 161-0729).

13. 10% Tween: (Sigma, Catalog No. T2700). 

14. 12x MES stock: For 10 ml, dissolve 0.7 g MES free acid monohydrate
(Sigma, Catalog No. M5287) and 1.93 g MES sodium salt (Sigma,
Catalog No. M5057) in 8 ml Molecular Biology Grade water. After
mixing well, check pH and adjust if needed to a pH between 6.5 and
6.7. Add Molecular Biology Grade water to a total volume of 10 ml.
Filter through a 0.2 µm filter, shield from light by wrapping tube in
foil, and store at 4 °C. Replace if solution becomes visibly yellow or
after 12 months, whichever comes first.

15. 2x Hybridization buffer: For 50 ml, mix 8.3 ml of 12x MES stock
(prepared as above), 17.7 ml of 5 M NaCl (J. T. Baker, Catalog No.
3624-01), 4.0 ml of 0.5 M EDTA, 0.1 ml of 10% Tween 20 (vol/vol),
19.9 ml filtered dH2O. Filter through a 0.2 µm filter.

16. Wash A: Mix 300 ml 20x SSPE (Sigma, Catalog No. S2015), 1 ml 10%
Tween (vol/vol), 699 ml filtered dH2O. Filter through a 0.2 µm filter.

17. Wash B: Mix 150 ml 20x SSPE (Sigma, Catalog No. S2015), 1 ml 10%
Tween (vol/vol), 849 ml dH2O. Filter through a 0.2 µm filter.
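
When several arrays are processed together, the per-sample volumes given for the hybridization mix (step 4) and the biotin labeling mix (step 10) can be scaled into master mixes. The following is a minimal sketch of that arithmetic; the 10% pipetting overage and all function and variable names are our own assumptions and are not part of the protocol.

```python
# Per-sample volumes in microliters, copied from steps 4 and 10 of the
# hybridization procedure. Names are illustrative only.
HYB_MIX_UL = {
    "2x hybridization buffer": 75.0,
    "B213 control oligonucleotide (0.2 fmol/ul)": 0.5,
    "mixed oligonucleotides (12.5 pmol/ul)": 12.0,
    "50x Denhardt's solution": 3.0,
}
STAIN_MIX_UL = {
    "20x SSPE": 180.0,
    "50x Denhardt's solution": 12.0,
    "1% Tween 20": 6.0,
    "1 mg/ml streptavidin-phycoerythrin": 1.0,
    "dH2O": 401.0,
}


def master_mix(recipe_ul, n_samples, overage=0.10):
    """Scale a per-sample recipe to n_samples plus a fractional overage.

    The default overage of 10% is an assumption, not a protocol value.
    """
    factor = n_samples * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in recipe_ul.items()}


if __name__ == "__main__":
    for name, vol in master_mix(HYB_MIX_UL, n_samples=8).items():
        print(f"{name}: {vol} ul")
```

Because both mixes are to be prepared immediately before use, a calculation like this is best done before the arrays are prewetted.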



2.6. Analysis of results 

2.6.1. Outlier masking (for HIP and HOP) 

The Affymetrix TAG4 array contains at least five replicate features for each tag 
probe, dispersed across the array so that outlier features can be identified and 
discarded before calculating an average intensity value for each tag. 

1. For each array feature, examine the surrounding 5 × 5 block of features.
If 13 (or more) of the 25 probes in this region differ from their trimmed 
replicate mean (the mean of the three middle replicates, excluding the 
highest and lowest replicates) by more than 10%, this probe is not 
suitable for data analysis. 

2. Once these outlier-dense regions have been identified, pad them by
including all probes within a 5-probe radius, as defined by
$\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} < 6$, where $x_1$, $x_2$, $y_1$,
and $y_2$ are the x and y coordinates for the two features.

3. Discard features for which (standard deviation of feature pixels/mean
feature pixels) > 0.3. The standard deviation is included in the .cel file
for Affymetrix arrays.

4. After identification and removal of outliers, calculate intensity values for 
each tag by averaging all unmasked replicates. 
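
The masking steps above can be sketched in a few lines of code. This is an illustration under stated assumptions, not the authors' Perl implementation: features are held in a dictionary keyed by array coordinates, the padding distance is taken to be Euclidean, and the pixel-level filter of step 3 is omitted because it needs the per-feature standard deviation from the .cel file. All function and parameter names are hypothetical.

```python
from collections import defaultdict
from math import hypot


def trimmed_mean(values):
    """Mean after dropping the single highest and single lowest value."""
    ordered = sorted(values)
    core = ordered[1:-1] if len(ordered) > 2 else ordered
    return sum(core) / len(core)


def mask_outliers(features, width, height, tol=0.10, min_bad=13, pad=6.0):
    """features maps (x, y) -> (tag, intensity); returns masked coordinates."""
    by_tag = defaultdict(list)
    for tag, intensity in features.values():
        by_tag[tag].append(intensity)
    ref = {tag: trimmed_mean(vals) for tag, vals in by_tag.items()}

    # A feature is deviant if it differs from its trimmed replicate mean
    # by more than tol (10% by default).
    deviant = {
        xy for xy, (tag, inten) in features.items()
        if abs(inten - ref[tag]) > tol * ref[tag]
    }

    # Step 1: features whose 5 x 5 neighbourhood holds >= min_bad deviants.
    dense = set()
    for (x, y) in features:
        count = sum(
            (x + dx, y + dy) in deviant
            for dx in range(-2, 3) for dy in range(-2, 3)
        )
        if count >= min_bad:
            dense.add((x, y))

    # Step 2: pad every dense feature by the (assumed Euclidean) radius.
    masked = set()
    r = int(pad)
    for (cx, cy) in dense:
        for x in range(max(0, cx - r), min(width, cx + r + 1)):
            for y in range(max(0, cy - r), min(height, cy + r + 1)):
                if hypot(x - cx, y - cy) < pad and (x, y) in features:
                    masked.add((x, y))
    return masked


def tag_intensities(features, masked):
    """Step 4: average all unmasked replicates of each tag."""
    kept = defaultdict(list)
    for xy, (tag, inten) in features.items():
        if xy not in masked:
            kept[tag].append(inten)
    return {tag: sum(v) / len(v) for tag, v in kept.items()}
```

Because replicate features are dispersed across the array, a localized artifact such as a scratch or bubble typically removes only one or two replicates of any given tag, so an average can still be computed from the rest.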



2.6.2. Saturation correction (for HIP, HOP, and MSP) 

The signal on the TAG4 array is not linearly related to tag concentration 
because of the phenomenon of feature saturation (Pierce et al., 2006). If
uncorrected, this saturation will cause the degree of sensitivity or resistance 
to be underestimated for strains with brighter tags. Saturation is corrected by 
comparing uptag and downtag ratios, specifically: 

1. Using a pair of arrays that are not sample replicates, calculate
$\ln[(i_c - bg)/(i_t - bg)]$ for each tag, where $i_c$ is the control
intensity, $i_t$ is the treatment intensity, and $bg$ is the background
as estimated by taking the mean intensity of the unassigned tag probes.

2. Mark ratios for any tags with minimum values less than 3× background
as unusable.

3. Pair uptag and downtag ratios by strain. Ignore ratios for any strains
with fewer than two usable tags.

4. For each strain, calculate the difference in average intensity for the two
tags: $(i_{tu} + i_{cu})/2 - (i_{td} + i_{cd})/2$, where the subscript u
indicates the uptag, and the subscript d indicates the downtag.

5. Sort ratios by the difference in average intensity. Take a sliding window 
of 600 ratio pairs, sliding 100 pairs at a time. For each window fit a 
line to the uptag ratios (x-axis) versus the downtag ratios (y-axis) using 
least-squares and take the slope. Calculate the mean of the differences 
in average intensity for the window. 

6. Fit a least-squares line to the intensity differences for all windows (x-axis), 
versus slopes for all windows (y-axis) and take the slope of this line. 

7. Repeat using the reverse intensity difference,
$(i_{td} + i_{cd})/2 - (i_{tu} + i_{cu})/2$, and taking the slope with
the axes reversed: the downtag ratios on the x-axis, and the uptag
ratios on the y-axis.

8. Average the slope calculated in step 6 with the slope in step 7. This is
the saturation correction factor S. A typical range for S is 0.0001 to 0.0005.

9. Adjust the raw intensity data using the following transformation:
$c(i) = i\,e^{Si}$.

10. To correct more than two arrays, calculate S for all possible pairs of
arrays, then use the median as the correction factor for all arrays in the
set. Using a larger group of arrays will improve the accuracy of S.
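
The estimation of S for a single pair of arrays can be sketched as follows. This is an illustrative reading of steps 1 to 9, not the published scripts: a strain is kept only if all four of its intensities are at least 3× background, and the final transform is assumed to have the form c(i) = i·exp(S·i). Function names and the input layout are hypothetical.

```python
from math import exp, log


def ols_slope(xs, ys):
    """Least-squares slope of y regressed on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def window_trend(diffs, xs, ys, window=600, step=100):
    """Steps 5-6: slope of per-window ratio slopes versus mean intensity
    difference, with strains sorted by intensity difference."""
    order = sorted(range(len(diffs)), key=diffs.__getitem__)
    mean_diffs, slopes = [], []
    for start in range(0, len(order) - window + 1, step):
        idx = order[start:start + window]
        slopes.append(ols_slope([xs[i] for i in idx], [ys[i] for i in idx]))
        mean_diffs.append(sum(diffs[i] for i in idx) / window)
    return ols_slope(mean_diffs, slopes)


def saturation_factor(strains, bg, window=600, step=100):
    """strains: list of (up_ctrl, up_trt, dn_ctrl, dn_trt) raw intensities."""
    up, dn, diff = [], [], []
    for uc, ut, dc, dt in strains:
        if min(uc, ut, dc, dt) < 3 * bg:   # steps 2-3: both tags must be usable
            continue
        up.append(log((uc - bg) / (ut - bg)))        # step 1, uptag
        dn.append(log((dc - bg) / (dt - bg)))        # step 1, downtag
        diff.append((ut + uc) / 2 - (dt + dc) / 2)   # step 4
    forward = window_trend(diff, up, dn, window, step)                # steps 5-6
    reverse = window_trend([-d for d in diff], dn, up, window, step)  # step 7
    return (forward + reverse) / 2                                    # step 8


def correct(intensity, s):
    """Step 9, assuming the transform c(i) = i * exp(S * i)."""
    return intensity * exp(s * intensity)
```

For a set of more than two arrays, `saturation_factor` would be called on every pair and the median of the results used as S, as described in step 10.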

2.6.3. Array normalization 

The uptags and downtags should be normalized separately, because they are 
amplified in separate PCR reactions, and the intensities of the individual 
PCR reactions will affect their array intensities. For MSP ORF probes, 
quantile or mean normalization is performed without distinguishing 
between probes because the MSP PCR is performed in a single reaction. 
Normalize using either quantile normalization or mean normalization. 

To quantile normalize a set of arrays: rank values obtained from each array for 
each set of tags (up and down) in order of increasing intensity. For each rank, 
assign the tag at that rank for each array to the median of all values at that rank. 

To mean normalize a set of arrays: for each set of tags (up and down), divide 
by the mean. Multiply each tag set by the mean across all arrays (this is for 
convenience only; it returns the tag intensities to approximately their 
original range). 
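
Both normalization schemes are short enough to write out. The sketch below assumes each array is supplied as a list of intensities for one tag set (uptags or downtags) in a fixed tag order, so it would be called once per tag set; the function names are our own.

```python
from statistics import median


def quantile_normalize(arrays):
    """arrays: equal-length intensity lists, one list per array."""
    n = len(arrays[0])
    # For each array, the tag indices sorted by increasing intensity.
    orders = [sorted(range(n), key=a.__getitem__) for a in arrays]
    # Median across arrays of the values found at each rank.
    rank_medians = [
        median(a[o[r]] for a, o in zip(arrays, orders)) for r in range(n)
    ]
    out = [[0.0] * n for _ in arrays]
    for k, order in enumerate(orders):
        for rank, idx in enumerate(order):
            out[k][idx] = rank_medians[rank]
    return out


def mean_normalize(arrays):
    """Divide each array by its own mean, then rescale by the grand mean."""
    means = [sum(a) / len(a) for a in arrays]
    grand = sum(means) / len(means)
    return [[v / m * grand for v in a] for a, m in zip(arrays, means)]
```

After quantile normalization every array has an identical intensity distribution, whereas mean normalization only equalizes the array means and leaves the shape of each distribution unchanged.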

2.6.4. Removing unusable tags 

Tags with low-intensity values in control samples will give poor-quality 
results. An intensity value threshold for excluding these tags can be chosen 
by comparing the correlation of uptag and downtag ratios as a function of 
tag intensity. Specifically:

1. Using any treatment-control pair, calculate
$\log_2[(i_c - bg)/(i_t - bg)]$ for each tag, where $i_c$ is the control
intensity, $i_t$ is the treatment intensity, and $bg$ is the mean
intensity of the unassigned tag probes.

2. Pair uptag and downtag ratios by strain and, for each tag pair, take the
minimum intensity for the two tags in the two samples. Sort the ratio
pairs by this minimum intensity.

3. Use a sliding window of size 50 on the ranked ratio pairs, starting with 
the lowest intensity pairs. Calculate the correlation of uptag and down- 
tag ratios for pairs within the window. Also calculate the average of the 
minimum intensities calculated in the previous step. 

4. Slide the window by 25 pairs, and repeat the previous step until all pairs 
have been traversed. 

5. Plot the average minimum intensity versus the uptag-downtag
correlation for all windows.

6. Choose an intensity threshold; for example, we find that the intensity
value at which the correlation first reaches 80% of its maximum level is
a good cutoff.

7. Mark any tags that are below this cutoff in either of the samples as 
unusable for subsequent analysis. 
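
The windowed correlation profile and the 80% rule can be sketched as below. The input layout (one tuple per strain holding the minimum intensity and the two log2 ratios) and the function names are assumptions made for illustration; plotting is left out, and the returned profile holds the points that would be plotted in step 5.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def intensity_cutoff(pairs, window=50, step=25, fraction=0.8):
    """pairs: (min_intensity, uptag_ratio, downtag_ratio), one per strain.

    Returns (cutoff, profile): cutoff is the mean minimum intensity of the
    first window whose correlation reaches `fraction` of the maximum, and
    profile is the list of (mean minimum intensity, correlation) points.
    """
    ranked = sorted(pairs)  # lowest minimum intensity first
    profile = []
    for start in range(0, len(ranked) - window + 1, step):
        chunk = ranked[start:start + window]
        r = pearson([c[1] for c in chunk], [c[2] for c in chunk])
        profile.append((sum(c[0] for c in chunk) / window, r))
    best = max(r for _, r in profile)
    for mean_intensity, r in profile:
        if r >= fraction * best:
            return mean_intensity, profile
    return None, profile
```

Inspecting the profile by eye remains worthwhile, because a noisy window at low intensity can occasionally cross the threshold by chance.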



2.6.5. Calculating log2 ratios for deletion pool screens 

1. For each tag, calculate $\log_2[(\mu_c - bg)/(\mu_t - bg)]$, where
$\mu_c$ is the mean intensity for the control samples, $\mu_t$ is the mean
intensity for the treatment samples, and $bg$ is the mean intensity of
the unassigned probes.

2. For each strain, average the log2 ratios for all usable tags to obtain a final 
sensitivity score. Strains that are sensitive will have positive scores, while 
strains that are resistant will have negative scores. This score is propor- 
tional to the log2 ratio of cells present in the control sample versus the 
treatment sample. 

Note: Whereas p values describe the level of confidence for calling strain 
sensitivity, the log2 ratios give the best estimate of the actual level of 
sensitivity for each strain. 

For MSP, as we are looking for gain of signal, it does not affect the results
significantly if low-intensity tags are present in the analysis.
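
As a worked illustration of the score calculation for deletion pool screens, the sketch below averages replicate intensities per tag, forms the background-subtracted log2 ratio, and averages the usable tags of each strain. The dictionary layout, the strain names in the usage example, and the function name are hypothetical; the code assumes that every usable tag has a mean intensity above background in both conditions.

```python
from math import log2


def sensitivity_scores(control, treatment, bg, tag_to_strain, usable):
    """control/treatment: dict tag -> list of normalized replicate intensities.

    Returns dict strain -> mean log2(control/treatment) over usable tags.
    Positive scores indicate sensitivity, negative scores resistance.
    """
    per_strain = {}
    for tag in usable:
        mu_c = sum(control[tag]) / len(control[tag])
        mu_t = sum(treatment[tag]) / len(treatment[tag])
        ratio = log2((mu_c - bg) / (mu_t - bg))
        per_strain.setdefault(tag_to_strain[tag], []).append(ratio)
    return {strain: sum(r) / len(r) for strain, r in per_strain.items()}


if __name__ == "__main__":
    control = {"s1_up": [500, 500], "s1_dn": [900], "s2_up": [200]}
    treatment = {"s1_up": [200, 200], "s1_dn": [300], "s2_up": [500]}
    strains = {"s1_up": "strain1", "s1_dn": "strain1", "s2_up": "strain2"}
    print(sensitivity_scores(control, treatment, 100, strains, list(strains)))
```

In the usage example, the strain whose tags are depleted fourfold in treatment scores 2 (sensitive), and the strain whose tag is enriched fourfold scores -2 (resistant).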

2.6.6. Calculating MSP ratios 

For MSP, each ORF is represented by at least two probes and the log2 ratios 
of each probe are averaged to generate a single score for each gene. To 
identify each suppressor locus, order the log2 ratio of intensities by each 
ORF's genomic location and perform the analysis using a sliding window to 
identify loci that have at least two adjacent ORFs with log2 ratios > 1.6.

All scripts for performing these analyses on the array data can be found at
these two URLs; the first describes the analyses, and the second contains the
actual Perl scripts.

http://chemogenomics.stanford.edu/supplements/04tag/analysis
http://chemogenomics.stanford.edu/supplements/04tag/download.html#scripts
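
For readers who prefer not to start from the Perl scripts, the MSP locus call described in Section 2.6.6 can be sketched compactly. This is our own illustration rather than a port of those scripts: ORFs are ordered by chromosome and start coordinate, probe ratios are averaged per ORF, and runs of at least two adjacent ORFs above the threshold are reported. The input layout and ORF names are hypothetical.

```python
def suppressor_loci(orf_scores, threshold=1.6, min_run=2):
    """orf_scores: list of (chromosome, start, orf_name, [probe log2 ratios]).

    Returns a list of loci, each a list of adjacent ORF names whose mean
    log2 ratio exceeds the threshold.
    """
    ordered = sorted(orf_scores, key=lambda o: (o[0], o[1]))
    loci, run = [], []
    prev_chrom = None
    for chrom, _start, name, probes in ordered:
        score = sum(probes) / len(probes)
        # A run ends at a chromosome boundary or at a sub-threshold ORF.
        if chrom != prev_chrom or score <= threshold:
            if len(run) >= min_run:
                loci.append(run)
            run = []
        if score > threshold:
            run.append(name)
        prev_chrom = chrom
    if len(run) >= min_run:
        loci.append(run)
    return loci
```

Requiring adjacent ORFs reflects the fact that a single genomic fragment in the library usually spans more than one ORF, so a genuine suppressor clone raises the signal of its neighbours as well.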

2.7. Confirmation of microarray data 

After performing a pooled fitness assay, several strains are often candidates for 
interacting with the test compound, based on their sensitivity or resistance. The 
choice of which strains to confirm after performing a pooled assay is somewhat 
arbitrary, but as a general guideline, the sensitivity scored using the pooled 
fitness approach tends to be confirmed for the majority of deletion strains with 
z-scores > 3 (Lee et al., 2005). Requiring a fold change of at least 2 (log2 = 1)
avoids testing strains associated with z-scores that are high only because they 
have a very reproducible abundance pattern in the control condition. 

While individual confirmation assays can be conducted in any desired
growth format, we perform them in 100 µl in 96-well plates, using the same
drug (and vehicle) concentration that was used in the genome-wide screen.
The 96-well plate format allows for the testing of several strains in replicate
in parallel, and limits the amount of drug required. Because some drugs
lose efficacy over time, it is crucial to include a wild-type strain, preferably
on the same growth plate. Ideally, all strains should be tested in triplicate in
both drug and vehicle, for robust statistical analysis. In most cases, a deletion
strain is confirmed as sensitive to the drug when its growth (compared to
wild type) is not affected in the control condition, but is inhibited by drug
(see ypt7Δ in Fig. 10.5). The vehicle control accounts for deletion strains
that exhibit a growth defect relative to wild type in the absence of
compound (e.g., erg4Δ in Fig. 10.5). In some cases, a growth defect may also be
observed for both vehicle and drug, but the strain is still considered
significantly sensitive because the growth defect in drug is greater than the
growth defect in vehicle (see vps3Δ in Fig. 10.5). Similarly, MSP assay results
can be confirmed using these growth assays by comparing the growth of
overexpression strains to that of the control in the conditions of interest.

Occasionally, strains fail to confirm. This may be due to biological or 
technical reasons. For example, cross-contamination can occur between wells 
on the agar plate where the deletion collection is stored, or on the plate used for 
the actual confirmation screen, arguing for careful handling of the strains. To 
test for cross-contamination and to verify a strain's identity, deletion strains can
be tested using PCR after streaking for single colonies
(http://www-sequence.stanford.edu/group/yeast_deletion_project/deletions3.html).
For homozygous deletion strains, it is necessary to test both for the presence of
the deletion cassette at the intended locus and for the absence of a
wild-type copy of the ORF.

Figure 10.5 Confirmation of microarray-detected growth sensitivity. Three deletion
strains (erg4Δ, vps3Δ, and ypt7Δ) with microarray-inferred sensitivity to drug were
tested by growing the strains in isolation. We used a Tecan Genios reader, with optical
density readings taken every 15 min over 20 h. Each strain was grown in triplicate and
compared to the wild-type growth both in vehicle (left panel) and in drug (right panel).
Growth defects can be either severe (exemplified by vps3Δ) or mild (exemplified by
erg4Δ). In this example, the sensitivity scored by microarray could be confirmed for
vps3Δ and ypt7Δ. Wild-type growth is represented by open circles and mutant growth
by filled squares. Representative growth curves for the replicates are shown. Tecan
ODs were converted to traditional "cuvette" ODs using the calibration function "real
OD" = -1.0543 + 12.2716 × measured OD.



In cases where the observed growth defects for deletion strains are minor 
or difficult to discern between drug and control, statistical tests can help to 
determine if the phenotype in question is significant or not. To perform 
these tests, three fitness scores (W) are derived for each of the mutants and 
conditions tested, using the three replicates of wild type and mutant (shown 
here for the first replicate only): 

$W_{1,\text{vehicle}} = \dfrac{\text{doubling time of wt}_{1,\text{vehicle}}}{\text{doubling time of mutant}_{1,\text{vehicle}}}$,
where 1 = strain replicate 1.

$W_{1,\text{drug}} = \dfrac{\text{doubling time of wt}_{1,\text{drug}}}{\text{doubling time of mutant}_{1,\text{drug}}}$,
where 1 = strain replicate 1.

To obtain a p value for whether the difference between W observed in
vehicle and drug is significant, a Student's t-test can be performed on the
two populations of three W values (using a two-tailed distribution,
two-sample equal variance). The choice of p value cutoff is arbitrary, but as a
guideline, we typically require a p value of < 0.05 for the observed growth
defect to be deemed significant. For example, performing these calculations,
erg4Δ in Fig. 10.5 is not significantly sensitive to the drug tested, but vps3Δ and
ypt7Δ are.

A W value below 1 indicates sensitivity of the mutant compared to wild 
type. To account for any growth defects due to the vehicle, a normalized 
fitness (NW) can be calculated as follows: 

$NW = \text{average } W_{\text{drug}} / \text{average } W_{\text{vehicle}}$

An NW value below 1 indicates that the strain has a drug-induced growth
defect. This NW value, along with its p value as calculated above, allows for
easy comparisons of phenotypes between drugs and strains, and can be 
reported as the "normalized fitness value" for the individual growth assays 
(Fig. 10.5). 
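
The fitness calculations for an individual growth assay can be sketched as follows. In practice a statistics package would normally supply the t-test; here the two-tailed p value is obtained by numerically integrating the t density so that the example runs with the Python standard library alone. All function names are our own, the doubling times in the usage example are invented for illustration, and the t-test assumes the replicates are not all identical (nonzero pooled variance).

```python
from math import gamma, pi, sqrt


def fitness(wt_doubling, mutant_doubling):
    """W per replicate: wild-type doubling time / mutant doubling time."""
    return [w / m for w, m in zip(wt_doubling, mutant_doubling)]


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value via Simpson integration of the density on [0, |t|]."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return max(0.0, 1.0 - 2.0 * total * h / 3.0)


def equal_variance_t_test(a, b):
    """Two-sample, equal-variance t-test; returns (t, two-tailed p)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ssa = sum((x - ma) ** 2 for x in a)
    ssb = sum((x - mb) ** 2 for x in b)
    df = na + nb - 2
    pooled = (ssa + ssb) / df
    t = (ma - mb) / sqrt(pooled * (1 / na + 1 / nb))
    return t, two_tailed_p(t, df)


def normalized_fitness(w_drug, w_vehicle):
    """NW = average W in drug / average W in vehicle."""
    return (sum(w_drug) / len(w_drug)) / (sum(w_vehicle) / len(w_vehicle))


if __name__ == "__main__":
    # Invented doubling times in minutes, three replicates each.
    w_vehicle = fitness([90, 92, 88], [91, 90, 93])
    w_drug = fitness([100, 98, 102], [180, 175, 190])
    t, p = equal_variance_t_test(w_vehicle, w_drug)
    print(f"NW = {normalized_fitness(w_drug, w_vehicle):.2f}, p = {p:.4f}")
```

With three replicates per condition the test has only four degrees of freedom, so modest growth differences will often fail to reach significance even when they are reproducible.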




3. Experimental Considerations 

1. Choosing the appropriate culture volume and starting OD. Choosing an
appropriate starting culture volume and cell concentration is critical for 
obtaining good results and using at least 300 cells per strain is recom- 
mended. If the culture vessel will not accommodate the volume needed 
to reach the desired cell numbers, multiple cultures can be grown in 
parallel and pooled at the end of the experiment. 

2. Replicates. Each experiment requires at least two arrays: a control
sample and a treatment sample. We often use a large (more than 10) set of
control arrays for analyzing many different experimental arrays, each 
with only one replicate. This control set can be used to calculate the 
statistical significance of the final results, and helps minimize the total 
number of experimental arrays needed. 

3. Comparing OD values between plates and cuvettes. If cells are grown in a
shaking spectrophotometer, note that the OD measured for each well in 
the plate will differ from the OD of the same culture when measured in a 
cuvette due to differences in path length. Similarly, OD readings in a 
shaking spectrophotometer will also vary with differences in culture 
volume. All OD values reported here refer to those measured in a 1 cm 
path-length cuvette. 

4. Checking for successful tag amplification. Tag PCR products should be
evaluated by agarose gel electrophoresis. The desired product is ~60
bp. A second band is often seen because noncomplementary tags can
hybridize at their common primer regions, forming a partially
single-stranded structure that migrates faster than the fully double-stranded
tag products. It is also common to observe amplification in no-template
control reactions; nonetheless, these spurious bands do not adversely
affect the results.

5. Barcode sequences. A full list of sequences is available (Smith et al., 2009).

6. Array options. These protocols use Affymetrix arrays; however, they can
be easily adapted to the array platform of your choice. For alternative 
array examples, see Pan et al. (2004) and Yuan et al. (2005). 

7. Evaluating data quality and replicate sample agreement. The most effective
way to measure the quality of technical replicate samples is to measure
the correlation of the log-transformed, normalized, and
saturation-corrected intensity values. The correlation of these "processed raw
values" should be at least 0.90 for replicates. For biological replicates
grown and prepared at separate times, we typically observe correlations
of about 0.7 (Ericson et al., 2008; Hillenmeyer et al., 2008).
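
The replicate check in item 7 amounts to a Pearson correlation on log-transformed intensities. A minimal sketch, assuming the two replicates are supplied as equal-length lists of normalized, saturation-corrected intensities in the same tag order (the function names and the choice of log base 2 are ours):

```python
from math import log2, sqrt


def log_correlation(rep_a, rep_b):
    """Pearson correlation of log2-transformed intensities of two replicates."""
    xs = [log2(v) for v in rep_a]
    ys = [log2(v) for v in rep_b]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def replicates_agree(rep_a, rep_b, minimum=0.90):
    """True if the log-scale correlation meets the technical-replicate bar."""
    return log_correlation(rep_a, rep_b) >= minimum
```

For biological replicates prepared at separate times, the `minimum` argument would be lowered toward the value of about 0.7 quoted above.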




4. Perspectives 

Genome-wide chemogenomic assays have proven extremely valuable 
in determining gene function and the mechanism of action of drugs and 
small molecule probes. To date, the majority of such gene-dose assays have 
relied on homozygous and heterozygous deletions in yeast to probe loss-of- 
function effects, and a genomic library of clones to investigate gain-of- 
function effects. Recently the palette of available resources for such screens 
has expanded considerably, offering the promise of higher resolution gene- 
dose assays. These collections include, but are not limited to:

1. DAmP alleles (Breslow et al., 2008; Yan et al., 2008)

2. Ts alleles (Ben-Aroya et al., 2008)

3. Systematic clone banks of ORFs (Ho et al., 2009) and genome fragments
(Jones et al., 2008)

In addition, the recently introduced barcoder technology (Yan et al.,
2008) will allow rapid barcoding of any strain collection for parallel analysis.
The experimental rationale that underlies these screens is not limited to
S. cerevisiae; indeed, work from several groups has produced genome-wide
collections for Escherichia coli (Baba et al., 2006; Kitagawa et al., 2005),
Schizosaccharomyces pombe (http://pombe.bioneer.co.kr/), and human cells
using interfering RNAs (Luo et al., 2008; Moffat et al., 2006). Each of the
collections can be screened in a manner analogous to what we describe in
this chapter. Experiments that use these collections will certainly expand
our perspective on cellular physiology at a systems level.

Finally, it is important to note that the screening platform described here
relies on high-density microarrays. Recent advances in high-throughput,
"next-generation" sequencing technologies promise to eventually displace
microarrays in this regard. Toward this end, we have directly compared
results obtained using a traditional microarray readout with next-generation
sequencing (Smith et al., 2009) and have found that the performance of
next-generation sequencing is comparable, and in some aspects superior, to
that of microarrays.



Abstract

Comprehensive analysis of yeast as a model system requires reliable
determination of its composition. Systematic approaches to globally determine
the abundance of RNAs have existed for more than a decade, and measurements of
mRNAs are widely used as proxies for detecting changes in protein abundance.
In contrast, methodologies to globally quantitate proteins are only recently
becoming available. Such experiments are essential, as proteins mediate the
majority of biological processes and their abundance does not always correlate
well with changes in gene expression. In particular, translational and
posttranslational controls contribute substantially to the regulation of protein
abundance, for example, in the heat shock stress response. The development of
new sample preparation methods, high-resolution mass spectrometry, and novel
bioinformatic tools is closing this gap and allows the global quantitation of the
yeast proteome under different conditions. Here, we provide background information
on proteomics by mass spectrometry and describe the practice of a
comprehensive yeast proteome analysis.




1. Introduction 

A major goal in analyzing yeast as a eukaryotic model is to understand 
how components of the system interact dynamically and to determine the 
"wiring" of the interacting parts. A prerequisite for such analysis is the 
ability to reliably and globally determine the composition of yeast cells 
under different conditions. While methods to determine RNA quantita- 
tively and comprehensively, such as microarrays, have existed for more than 
a decade, the technology to globally determine changes of protein abun- 
dance is only recently becoming available. Proteins, however, constitute the 
majority of biologically active agents and information on their relative 
abundance is thus often crucial. Since changes in the amount of a protein 
are not always reflected in corresponding mRNA level changes (e.g., see 
Bonaldi et al., 2008), it is essential for many experiments to measure them
directly. This is particularly evident for regulatory processes that are 
mediated by posttranscriptional regulation affecting, for example, protein 
stability or production, such as heat stress. 

For these reasons, many techniques to determine the relative abundance of 
proteins have been developed: Most notably, Western blot or fluorescence 
measurements of tagged proteins are routinely used and comprehensive 
libraries containing most yeast open reading frames fused to GFP or the TAP 
tag have been developed (for a global analysis using these resources see, e.g., 
Ghaemmaghami et al., 2003; Huh et al., 2003). However, in some cases, these
tagging methods may interfere with protein function as they rely on altering
the protein sequence by introducing tags. For example, C-terminally modified
(e.g., by lipidation on a CaaX box) or tail-anchored proteins are not functional
in these libraries as the tags are usually introduced at the C-terminus of proteins. 
In addition, assays relying on libraries of tagged proteins are not easily applied to 
global experiments, especially when the goal is to compare multiple experi- 
mental conditions, because one experiment for each gene or about 6000 
individual experiments for all genes have to be performed for each condition. 

In addition to these techniques, mass spectrometry (MS) is used to 
identify single proteins, for example, purified and resolved by denaturing 



Yeast Expression Proteomics by High-Resolution Mass Spectrometry 261 

SDS polyacrylamide gel electrophoresis and this has become a standard tool 
in biochemistry. During the last 5 years, accelerating advances in MS 
technology, sample preparation, and computational proteomics led to the 
development of capabilities to comprehensively determine the relative 
amount of proteins in complex mixtures. In particular, the advent of
precision proteomics due to new instrumentation, such as hybrid linear ion trap
Fourier-transform mass spectrometers (e.g., the LTQ-Orbitrap), led to high
mass resolving power (Mann and Kelleher, 2008). Resulting from
these advances and the concomitant development of computational tools,
"shotgun" approaches to sequence an increasingly high number of peptides
from complex samples are now available (Cravatt et al., 2007; Domon and
Aebersold, 2006). The high mass resolution obtained in these experiments
reduces the number of potentially false-positive assignments and increases
the number of identifications from a single chromatographic run (Cox and
Mann, 2008). Together these advances enabled us in 2008 to report the first
determination of a complete eukaryotic proteome from Saccharomyces
cerevisiae (de Godoy et al., 2008).

To perform such global proteome quantitation, we used a state-of-the- 
art proteomics analysis setup consisting of sample preparation to separate 
peptides by isoelectric focusing (IEF), separation of peptides by liquid 
chromatography (LC), and online-injection of eluting peptides to the MS 
by electrospray ionization (ESI). In this chapter, we provide background on 
MS methods and describe the principles and practices of such a proteomic 
experiment. We also discuss how measurement time will be reduced 
drastically from our original report by new developments in sample prepa- 
ration, novel MS instrumentation, and advanced computational methods. 
We expect that these developments together will in the future make 
expression proteomics of yeast a standard experiment for quantitative, 
comprehensive approaches. 




2. Background, Methods, and Applications 

2.1. The challenge 

The principal challenge of proteomics is to reliably identify and quantitate
proteins in dauntingly complex mixtures, for example, in a protein extract.
In the case of yeast, at least 4200 proteins are expressed under normal growth
conditions (de Godoy et al., 2008; Ghaemmaghami et al., 2003; Huh et al.,
2003). Some fraction of these proteins is posttranslationally modified, for
example, by phosphorylation, acetylation, or glycosylation, and this further 
increases the chemical complexity of the polypeptide mixture in an extract. 
At the moment not every protein with all modifications can be
quantitated from a single experiment. However, several methods are successfully
employed to approach the problem. Initial experiments resolved proteins by
sequential electrophoresis in two dimensions (2D-gel analysis), with each
dimension fractionating the proteins based on a different principal
characteristic of proteins (O'Farrell, 1975). Most commonly, IEF in one
dimension and separation by size in denaturing SDS gels in the second dimension
are used. The first "proteome-scale" experiments using this technology
identified roughly 150 yeast proteins (Shevchenko et al., 1996). However,
this method often leads to multiple spots for the same protein that differ, for
example, in their modification(s) (Fountoulakis et al., 2004). In addition, 2D
gels are strongly biased toward only the most abundant proteins. Thus
identification of more than a few hundred different proteins is generally
not feasible. Moreover, accurate quantitation is often not possible because of
overlapping spots (Campostrini et al., 2005). A further complication is that
the technology by itself does not allow the identification of proteins
observed and therefore it is usually combined with another analytical
method, such as MS or Western blotting. For these combined reasons, 2D
gels have not developed into a comprehensive proteomics technology.

In contrast, MS-based proteomics can unambiguously identify proteins
in a very complex mixture with minimal prior separation. Because MS is a
versatile tool that combines several unique capabilities, such as
quantification of proteins from a cell or organism and characterization of important
posttranslational modifications (PTMs) of proteins (e.g., phosphorylation)
in addition to identification of individual proteins in a complex mixture,
it has become the most important technology in proteomics today
(Aebersold and Mann, 2003).

In an MS-based proteomic experiment, complex protein mixtures are 
usually digested by a protease, yielding a mixture of tens of thousands of 
peptides with a range of abundance over more than four orders of magni- 
tude. Until recently, this complexity has limited the feasibility of "shotgun" 
approaches that aim to directly identify the peptides in the mixture. To deal 
with this problem, several methods to reduce the complexity of peptide 
mixtures for analysis were introduced. Traditionally, the starting material, 
that is, the yeast extract, is further fractionated using, for example, subcellu- 
lar fractionation or denaturing SDS gel electrophoresis. Peptides resulting 
from the digest of such fractions are then separated by reversed-phase LC 
right before being analyzed "online" in the MS. Our recent experience, 
however, has been that extensive fractionation and separation at the protein 
level leads to rapidly diminishing returns in terms of the number of
identified proteins (Bonaldi et al., 2008; de Godoy et al., 2008). Instead, several
strategies for the fractionation of peptides after proteolytic digestion of
protein mixtures have been designed. In one variation of this principle,
termed multidimensional protein identification technology or "MudPIT,"
the resulting peptides are separated by strong cation exchange
chromatography (Washburn et al., 2001). In addition, some variant techniques with
mixed anion/cation beds also exist (e.g., see Motoyama et al., 2007).
Recently, an alternative method employing IEF of peptides in a combined
stationary and liquid phase was developed, and this was an important
contribution to directly assessing the protein composition of total yeast extract
(de Godoy et al., 2008; Hubner et al., 2008). This protocol, which is
described in detail below, uses immobilized pI strips to separate peptides
by IEF and further resolves the peptides in the resulting fractions by LC to
directly analyze them by MS.

2.2. Background on MS instrumentation for "shotgun" proteomics

MS is essentially a technique for weighing molecules, but the measurements 
are not performed with a conventional balance or scale. Instead, in MS 
gas phase ions of peptides are separated or filtered according to their 
mass-to-charge (m/z) ratio in a magnetic or electrostatic field and finally 
recorded by a detector. The resulting mass spectrum is a plot of the relative 
abundances of the produced ions as a function of their m/z ratio (see 
Fig. 11.2). 

Because every peptide molecule and modification has a characteristic 
mass, MS is a very powerful and nearly universal tool in proteomics and can 
result in determination of the chemical composition when mass accuracy is 
sufficiently high. Peptides furthermore have distinct fragmentation patterns 
that provide structural information to identify their amino acid sequences 
and modifications. 

MS instrumentation developments have greatly contributed to recent 
breakthroughs in proteomic research. Several types of MS are currently 
employed in proteomics. They are distinguished by the ionization method 
used to charge peptides and by the type of mass analyzer used to determine 
the mass-to-charge (m/z) ratio of the resulting ions. 

As traditional ionization methods such as chemical ionization are often
too harsh, "soft" methods that allow ionization of intact biomolecules are
necessary for MS-based proteomics. The two ionization methods employed
in proteomics are matrix-assisted laser desorption ionization (MALDI) and
ESI. ESI, which is used most commonly, allows large, nonvolatile molecules
such as peptides and proteins to be ionized nondestructively directly
from a liquid phase, usually consisting of a mixture of volatile organic
solvent and acidified water. In electrospray, a liquid is passed through a
nozzle to which a high voltage is applied (Fenn et al., 1989; Whitehouse
et al., 1985). The charged liquid becomes unstable as it is forced to hold
more and more charges. Soon the liquid reaches a critical point, and near the
tip of the nozzle it blows apart into a cloud of tiny, highly charged droplets.
These droplets rapidly shrink as solvent molecules evaporate from their
surface, increasing the electric field at the droplet surface. By a process of
"ion evaporation" (Iribarne and Thomson model) or simple solvent
evaporation (charged residue model), the "naked" biomolecule becomes a
gas-phase ion (Iribarne and Thomson, 1976).

The other "soft" ionization technique, MALDI, was also developed
in the late 1980s (Hillenkamp and Karas, 1990; Hillenkamp et al., 1991;
Karas and Hillenkamp, 1988). In this technique, analyte molecules are
cocrystallized with a UV- or IR-absorbing substance, termed the matrix,
which is usually an organic carboxylic acid such as 2,5-dihydroxybenzoic
acid (UV-absorbing) or succinic acid (IR-absorbing).
The analytes are desorbed and ionized by a laser beam (pulsed laser
irradiation) from the solid or liquid surface containing the organic matrix
compound in approximately 1000-fold excess. A widely accepted view of how
the matrix assists the ionization is that neutral sample molecules are
ionized by acid-base proton transfer reactions with the protonated
carboxylic acid matrix ions in a dense phase just above the surface of the
matrix.

Like ESI, MALDI is capable of efficiently ionizing large biomolecules
such as peptides and proteins and is often used with time-of-flight (TOF)
MS (see below) due to the vacuum compatibility and pulsed nature of the
technique (the laser pulse frequency can easily be synchronized with the
TOF extraction pulse). Both ESI and MALDI ionization allow introduction
of biological molecules exceeding one million Daltons into MS, but they
are by far most often used for analysis of peptides. For this purpose, the main
difference between the two methods is that ESI predominantly produces
multiply charged ions, [M+nH]n+. In contrast, MALDI almost exclusively
generates singly charged peptide ions, [M+H]+, which can be difficult to
sequence by the low-energy dissociation methods available on most
proteomic mass analyzers: when the single proton is fixed on the side
chain of an arginine or lysine residue, no mobile proton is available to
induce fragmentation of the peptide amide bonds. 2D-gel-based proteomics
is almost exclusively coupled to MALDI-TOF MS analysis,
whereas most other areas of proteomics are increasingly based on ESI,
since it is possible to integrate ESI with online LC-MS/MS.
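
The practical consequence of multiple charging can be illustrated with a short calculation. The following is a minimal sketch, assuming the usual convention that each charge is carried by one added proton; the peptide mass of 1500 Da is a hypothetical example value and the function name is chosen here only for illustration.

```python
PROTON_MASS = 1.007276  # Da, mass of one proton (standard value)


def mz(neutral_mass: float, charge: int) -> float:
    """Return m/z of a peptide of the given neutral monoisotopic mass
    that carries `charge` protons, i.e. an [M+nH]n+ ion."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


if __name__ == "__main__":
    peptide_mass = 1500.0  # hypothetical neutral peptide mass in Da
    for z in (1, 2, 3):
        # z = 1 is the typical MALDI ion, z >= 2 the typical ESI ions
        print(f"charge {z}+: m/z = {mz(peptide_mass, z):.4f}")
```

The same peptide therefore appears at a different m/z for each charge state, and the spacing of its isotope peaks (about 1/z) reveals the charge.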

After ionization, the mass of peptide ions is determined. The earliest
analyzers used in MS-based proteomics combined a series of quadrupoles,
each capable of selecting specific ions that can pass an applied electric field
that deflects ions with other m/z ratios. In the first quadrupole, peptides are
filtered (MS spectrum). In the second quadrupole, one filtered peptide at a
time is fragmented at the peptide bond by collision with noble gases such as
argon or helium; this is commonly called collision-induced dissociation
("CID"). After the collisions, the resulting fragment ions are filtered in
the third quadrupole (MS/MS spectrum).
Information about the sequence of the peptide analyzed is then contained
in the mass differences between series of fragment ions in this MS/MS
spectrum.
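
How sequence information is encoded in mass differences can be sketched in a few lines. The example below is an illustration only: it assumes singly charged b- and y-type fragments (the usual products of cleavage at the peptide bond), uses standard monoisotopic residue masses that are not taken from this chapter, and applies them to the hypothetical peptide PEPTIDE. All function names are invented for this sketch.

```python
# Standard monoisotopic residue masses in Da (textbook values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # Da
PROTON = 1.007276   # Da


def b_ions(peptide):
    """Singly charged N-terminal fragments b1 .. b(n-1)."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    return [sum(masses[:i]) + PROTON for i in range(1, len(masses))]


def y_ions(peptide):
    """Singly charged C-terminal fragments y1 .. y(n-1)."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    return [sum(masses[-i:]) + WATER + PROTON for i in range(1, len(masses))]


def read_sequence_from_ladder(ladder, tolerance=0.01):
    """Translate mass differences between consecutive fragment ions
    into residues; isobaric residues (L/I) cannot be told apart."""
    residues = []
    for lighter, heavier in zip(ladder, ladder[1:]):
        delta = heavier - lighter
        matches = [aa for aa, m in RESIDUE_MASS.items()
                   if abs(m - delta) <= tolerance]
        residues.append("/".join(matches) if matches else "?")
    return residues


if __name__ == "__main__":
    ladder = b_ions("PEPTIDE")
    print([round(m, 4) for m in ladder])
    print(read_sequence_from_ladder(ladder))
```

Real spectra contain gaps, noise peaks, and mixed ion series, which is why database search engines rather than simple ladder reading are used in practice.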

As an alternative to CID, another fragmentation method, electron transfer
dissociation ("ETD"), has been employed in the last few years. In this
method, electrons are transferred from radical anions to the multiply
positively charged peptide ions, which then fragment adjacent to the amino
group of the peptide bond (for review, see Mikesh et al., 2006). CID sometimes
yields incomplete spectra, especially for very basic peptides. ETD often yields
more uniformly fragmented peptides. In addition, the fragmentation is
more specific for the peptide backbone, and therefore PTMs such as
phosphorylations are less likely to be lost in the spectra of
peptide-backbone-derived fragments. These different fragmentation methods
can in principle be coupled with any type of MS instrument.

The classic setup consisting of triple quadrupoles described here is in
principle very fast, but its mass accuracy (usually limited to 0.5 Da) and
sensitivity are not high enough for experiments aimed at large-scale
discovery of proteins in complex mixtures. Therefore, triple quadrupoles are
currently mainly employed in multiple reaction monitoring (MRM) experiments,
where the fast MS/MS switching capability is used to detect a few
preprogrammed fragmentation patterns for a number of predefined peptides, with
the goal of accurate and targeted quantitation in complex mixtures
(Anderson et al., 2009).

As an alternative instrument type, TOF mass spectrometers measure the travel
time of ions to the detector after they have all been accelerated to the same
kinetic energy. This time is proportional to the square root of the
mass-to-charge ratio (m/z), which can therefore be measured quite accurately.
A drawback of these instruments is that very high resolution and sensitivity
are difficult to achieve at the same time.
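
The relation between flight time and m/z can be made concrete with a short sketch. It assumes an idealized linear TOF analyzer in which an ion of charge z is accelerated through a potential U and then drifts a field-free length L, so that t = L * sqrt(m / (2 z e U)). The drift length and voltage used as defaults below are hypothetical example values, not instrument specifications.

```python
import math

DALTON_KG = 1.66053906660e-27         # kg per Da
ELEMENTARY_CHARGE = 1.602176634e-19   # C


def flight_time(mz: float,
                drift_length_m: float = 1.0,
                accel_voltage_v: float = 20000.0) -> float:
    """Flight time in seconds for an ion of the given m/z (Da per
    elementary charge) in an idealized linear TOF analyzer.
    Drift length and voltage defaults are hypothetical."""
    mass_per_charge = mz * DALTON_KG / ELEMENTARY_CHARGE  # kg per C
    return drift_length_m * math.sqrt(mass_per_charge / (2.0 * accel_voltage_v))


if __name__ == "__main__":
    for value in (500.0, 1000.0, 2000.0):
        print(f"m/z {value:7.1f}: {flight_time(value) * 1e6:.2f} microseconds")
    # A fourfold increase in m/z doubles the flight time.
    print(flight_time(2000.0) / flight_time(500.0))
```

Because the time grows only with the square root of m/z, small timing errors translate into comparatively larger relative mass errors at high m/z.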

A mass analyzer type currently used more commonly in proteomics is the
ion trap. The basic principle of ion traps is similar to that of quadrupoles,
with the exception that selected ions are trapped in the electric field and
can be accumulated over time. This makes this instrument highly sensitive,
but, similar to quadrupole analyzers, its resolution is quite limited,
sometimes leading to a mass uncertainty of several daltons. To overcome some
of these drawbacks of conventional 3D ion traps, a new generation of ion traps
with superior ion capacity, dynamic range, scan speed, and sensitivity has
been introduced. These are the linear ion traps (or 2D ion traps), essentially
segmented quadrupole mass filters, capable of trapping and detecting about a
hundred times more ions than traditional 3D ion traps (Hager and Yves Le
Blanc, 2003; Schwartz et al., 2002).

A major breakthrough in proteomics was the introduction of a "hybrid"
MS, the LTQ-Orbitrap, consisting of a linear ion trap and an orbitrap
(Scigelova and Makarov, 2006; Fig. 11.1). The orbitrap is the first
fundamentally new mass analyzer in more than 20 years. The instrument contains
three components. It has a linear quadrupole ion trap (LTQ), in which it is
possible to control and manipulate (e.g., accumulate and collisionally
activate) ions on the subsecond time scale. Detection can be achieved in two
ways. In the linear ion trap, ions can be ejected radially through slits in
the quadrupole rods and detected by two electron multiplier detectors.
Alternatively, ions are ejected axially from the ion trap and transferred via
octopole ion guides into another ion trap (the C-trap), where they are
collisionally cooled and focused before they are orthogonally injected into
the third component of the instrument, the orbitrap mass analyzer, which
operates in very high vacuum. The LTQ-Orbitrap instrument is particularly
suitable for both qualitative and quantitative analysis of complex peptide
mixtures because of its high sensitivity, dynamic range, mass accuracy, and
sequencing speed. This allows for sequencing of thousands of peptides by
high-resolution tandem MS in less than 1 hour of LC-MS/MS analysis time.
Because of these advantages, which allow collection of spectra with very high
resolution (60,000) and routinely with low parts-per-million mass accuracy,
this LTQ-Orbitrap instrument is used for all MS analyses described below.

Figure 11.1 Schematic representation of typical tandem mass spectrometers for
proteomics experiments. (A) Principal setup of a tandem mass spectrometer.
Peptides are typically separated by liquid chromatography (LC) up-front and
transferred to the gas phase in the ion source (by either MALDI or ESI).
Peptide ions of interest can be separated or isolated in the first mass
analyzer (either an ion trap or a quadrupole) and injected into the collision
cell. The resulting fragment ions are analyzed in a second mass analyzer, for
example, an ion trap or a TOF, and recorded by electron multiplier detectors
or by induction in Fourier-transform instruments. (B) Schematic overview of
the LTQ-Orbitrap Velos. The front end of the instrument is a dual linear
ion trap mass spectrometer capable of efficient ion accumulation, isolation,
fragmentation, and detection of MS or MSn ions. Accumulated ion populations
are moved into the C-trap via an octopole ion guide or, for higher energy
dissociation, accelerated into the HCD collision cell, and the resulting
fragments are subsequently moved back to the C-trap. In the C-trap, the motion
of the ion population is damped by a residual pressure of nitrogen. Ions are
then injected into the orbitrap in a short pulse and begin to circle the
central electrode. The ion signals (peptide m/z values) are detected via a
differential amplifier between the two halves of the outer orbitrap
electrodes.
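
To put the resolution and mass accuracy figures quoted for the LTQ-Orbitrap into numbers, the following minimal sketch uses the common definitions: resolution R = m / delta_m (peak width at half maximum) and mass error in parts per million. The measured and theoretical m/z values in the example are hypothetical.

```python
def peak_width(mz: float, resolution: float) -> float:
    """Peak width (FWHM) implied by a resolution R = m / delta_m."""
    return mz / resolution


def ppm_error(measured_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


if __name__ == "__main__":
    # A resolution of 60,000 at m/z 400 corresponds to peaks < 0.01 wide.
    print(f"peak width at m/z 400: {peak_width(400.0, 60000.0):.4f}")
    # Hypothetical measurement: 0.0008 above the theoretical m/z of 400.
    print(f"mass error: {ppm_error(400.0008, 400.0):.2f} ppm")
```

A low-ppm error at m/z 400 thus corresponds to an absolute deviation of roughly a thousandth of a mass unit, far below the 0.5 Da typical of quadrupole instruments.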

2.3. Quantitative proteomics 

In many proteomic experiments, the goal is not only to test the presence of a
specific protein, but to quantitate its abundance as well. To date, most
approaches rely on relative quantitation between different conditions. In a
"label-free" approach, the integrated intensity of peptide peaks is compared
between different experiments and used as a measure of protein abundance.
At this point, however, methods that compare intensities of peptides in the
same LC-MS run are more reliable. Several methods exist that specifically
label peptides of one condition during the proteolytic cleavage, for example,
by use of water containing "heavy" 18O, which is incorporated at the
C-terminus of each peptide (Yao et al., 2001). Alternatively, chemical
labeling of protein mixtures can be employed. Affinity tags with different
masses are sometimes used, and this technique was termed the "isotope-coded
affinity tag" assay ("ICAT," Gygi et al., 1999). Most commonly,
thiol-reactive reagents are used to covalently link an isotope-labeled
affinity tag (e.g., biotin) to cysteine-containing peptides. These are
subsequently affinity purified and analyzed by MS. When samples are mixed
after coupling to differently labeled reagents, they can be distinguished
afterwards by their mass difference. Disadvantages of this method are that it
introduces an additional processing step that may have different efficiency
in different samples and that comparatively few tryptic peptides contain a
cysteine that can be modified. For these reasons, ICAT is not generally used
today. Instead, iTRAQ labeling of amine groups has become popular (Ross et
al., 2004). In this technique, peptides are labeled with up to eight different
isobaric tandem mass tags, mixed, and analyzed by LC-MS/MS. Each tag consists
of a reporter and a balance group and has been designed such that it is very
prone to fragmentation under CID. The technique is based on chemically tagging
the free N-terminus and the lysine ε-amino group of peptides generated from
protein digests that have been isolated from different cell states. The
tagged samples are then combined, and in the full scan spectra, peptides from
the different conditions appear at the same m/z ratio. Upon fragmentation,
however, different reporter ions are released from the different iTRAQ tags,
and their relative amount is then quantitated to calculate the relative
contribution of peptides from each condition to the intensity signal of the
protein.

For yeast cells, metabolic labeling is the preferred method for comparative
proteomics, since it is easily done with amino acids containing stable
nonradioactive isotopes incorporated in a cell culture and results in highly
uniform and efficient labeling ("SILAC," Ong et al., 2002). Conveniently,
amino acids are chosen that ensure that one and only one labeled residue is
present in each peptide. Trypsin is a commonly used protease to digest
protein mixtures, and it cleaves at the carboxyl side of arginine and lysine.
Thus, when proteins are labeled with these two amino acids (with
[13C6/15N2] L-lysine and [13C6/15N4] L-arginine, respectively) and digested
with trypsin, each resulting peptide contains one labeled amino acid.
Analogously, the endoproteinase LysC, which cleaves after lysines, is used on
proteins labeled only with lysine. For comparative proteomics, both the
labeled and unlabeled proteins are treated the same way; most easily,
they are mixed together in an equal ratio of total protein and processed
together. The resulting peptides are easily distinguished in the MS by
their characteristic mass difference, and computational proteomics software
automatically identifies and quantifies corresponding SILAC pairs, giving an
accurate ratio of light and heavy peptides, which can then be averaged for
protein ratios. This analysis is not limited to two different isotope forms
of peptides, and a third label is often used. However, each additional label
increases the complexity of the mixture and therefore complicates the
comprehensive analysis of all peptides contained in a mixture.
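
The "characteristic mass difference" of a SILAC pair follows directly from the isotope composition of the label. The sketch below is an illustration under stated assumptions: it uses standard isotope mass differences (not given in this chapter), the short names "Lys8" and "Arg10" for [13C6/15N2] lysine and [13C6/15N4] arginine, and a function name invented here.

```python
C13_SHIFT = 1.003355  # Da, mass difference 13C - 12C (standard value)
N15_SHIFT = 0.997035  # Da, mass difference 15N - 14N (standard value)

# Mass added by one fully labeled residue.
LABEL_SHIFT = {
    "Lys8": 6 * C13_SHIFT + 2 * N15_SHIFT,   # [13C6/15N2] L-lysine
    "Arg10": 6 * C13_SHIFT + 4 * N15_SHIFT,  # [13C6/15N4] L-arginine
}


def silac_pair_spacing(label: str, charge: int, labeled_residues: int = 1) -> float:
    """m/z distance between the light and heavy form of a peptide."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return labeled_residues * LABEL_SHIFT[label] / charge


if __name__ == "__main__":
    for label in LABEL_SHIFT:
        for z in (2, 3):
            print(f"{label}, charge {z}+: "
                  f"{silac_pair_spacing(label, z):.4f} m/z units apart")
```

Because each peptide carries exactly one labeled residue when the protease matches the label, every pair of a given charge state shows the same spacing, which makes automated pair detection straightforward.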

A complication of metabolic labeling of yeast with heavy isotope containing
amino acids is their potential interconversion due to coupled amino acid
synthesis pathways. In practice, only conversion from arginine to proline is
common (Ong et al., 2003). Depending on the cell type and the sequence of each
peptide (the number of prolines contained), this can result in variable and
complex patterns of SILAC peptide peaks, preventing accurate automated
analysis. To avoid this problem, one possibility is to reduce the amount of
labeled arginine added to the synthetic medium; another is to omit
arginine labeling altogether and to use LysC instead of trypsin for the
digest. This latter approach has proven very powerful, especially in yeast.
It results in peptides of increased average length but concomitantly reduces
the complexity of the mixture. In addition, lysine labeling of yeast cells is
easily achieved since many strain backgrounds are lysine auxotrophs. For
example, the commonly used BY4739 strain used to derive the MAT alpha gene
deletion collection bears the lys2Δ0 allele, is a lysine auxotroph, and can
therefore be directly labeled (Giaever et al., 2002; Winzeler et al., 1999).
Other commonly used lab strains, such as W303, can easily be made auxotrophic
by deleting LYS2 and ARG4 when double labeling with arginine and cleavage with
trypsin is desired.

This in vivo labeling approach can be used to compare different conditions,
and Fig. 11.2 shows a representative workflow for such an experiment.
In this way, we compared the complete proteome of haploid and diploid
yeast cells. Of course the same logic can be applied to cells grown under
different conditions, cells with a different genotype (e.g., harboring a
deletion) or different biochemical fractions. For example, in a variation of
this protocol, protein complexes purified from heavy lysine labeled cells by
affinity chromatography are directly compared to background resulting
from unspecific binding of proteins from light labeled cells to a control
matrix (Vermeulen et al., 2008). Similarly, many other variations of this
general principle are possible. Examples also include the measurement of
turnover of proteins (Doherty et al., 2009). For such an experiment, cells
are switched from medium containing "light" amino acids to medium
containing "heavy" amino acids for defined times. The change of ratio
over time will then indicate the time course of protein turnover, measured
individually for each protein.

Figure 11.2 Workflow of a SILAC experiment in yeast. 1. Yeast cultures are
grown in either light or heavy amino acid containing medium. 2. After
harvesting, proteins are digested with a protease to yield peptides. 3. These
are subsequently separated by an analytic method that is orthogonal to
reversed-phase chromatography (such as isoelectric focusing or ion exchange
chromatography methods) prior to LC-MS analysis. Here, isoelectric focusing
on immobilized pI strips is shown as an example. 4. The fractions of this
separation are then analyzed by LC-MS/MS. In full scan spectra, the
same peptide from different cell populations is quantified by the relative
difference in intensities between the SILAC labels. Peptides are subsequently
sequenced and identified by MS/MS.

2.4. Computational proteomics and data analysis 

The streamlining of acquisition and the large amount of data necessitate
efficient and automated evaluation of the resulting spectra. For example, in
a typical proteomic experiment, at least 12 fractions from the IEF of peptides
are analyzed, each by an LC run of at least 120 min, collecting a spectrum
every second. Together, these runs result in at least 86,400 MS spectra of
high mass accuracy that need to be evaluated. In addition, the most abundant
peaks (usually the top five in abundance) of every MS scan are fragmented and
the MS/MS spectra are collected, adding further data to be evaluated. A major
breakthrough in computational proteomics was recently achieved with the
MaxQuant software suite (Cox and Mann, 2008). The algorithms use correlation
analysis and graph theory to detect peaks and isotope clusters in the MS,
using the m/z ratio, intensity, and LC elution time as parameters. Figure 11.3
shows a plot with every detected peptide plotted as its m/z versus its elution
time in green. Successful identifications are shown in purple. As can be easily
appreciated, the identification rates are very high, usually resulting in
identification of the majority of peptides. This information is automatically
submitted to a commercial search engine (Mascot). An important consideration
in computational proteomics is the control of the accuracy of peptide
identification. Today, 99% certainty of protein identification is usually
desired, and this is controlled by monitoring the number of identifications
in a "nonsense" or decoy database consisting of reversed protein sequences
(Elias and Gygi, 2007). This search results in automatic identification of
peptides in the mixture with high confidence. In the next step, peak
intensities for each member of a SILAC pair are calculated from the isotope
pattern. Multiple measurements for each peptide are integrated and
statistically evaluated, resulting in a measurement of the abundance ratio of
the proteins in each sample and a confidence estimate for that measurement.
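
The decoy principle can be illustrated with a small sketch. It is not the MaxQuant or Mascot implementation; it assumes one common estimator, in which the false discovery rate above a score threshold is taken as the number of decoy (reversed-sequence) hits divided by the number of target hits. The scores and function names below are hypothetical.

```python
def fdr_at_threshold(hits, threshold):
    """Estimate the FDR among hits scoring at or above `threshold`.

    `hits` is a list of (score, is_decoy) tuples. The estimator
    decoys / targets is one common choice, assumed here."""
    targets = sum(1 for score, is_decoy in hits
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in hits
                 if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


def lowest_threshold_for_fdr(hits, max_fdr=0.01):
    """Return the lowest observed score at which the estimated FDR
    is still within `max_fdr`, or None if no threshold qualifies."""
    best = None
    for score in sorted({s for s, _ in hits}, reverse=True):
        if fdr_at_threshold(hits, score) <= max_fdr:
            best = score
    return best


if __name__ == "__main__":
    # Hypothetical search results: (score, matched a reversed sequence?)
    hits = [(90, False), (80, False), (70, False), (60, True), (50, False)]
    print(fdr_at_threshold(hits, 60))            # 1 decoy among 3 targets
    print(lowest_threshold_for_fdr(hits, 0.01))  # cutoff for 1% FDR
```

A 99% certainty of identification corresponds to accepting hits only down to the score at which this estimate reaches 1%.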

Figure 11.3 Workflow of the analysis of a proteomic experiment. Complex LC-MS
raw files containing MS and MS/MS spectra collected from the LTQ-Orbitrap are
fed into the MaxQuant software package, which automatically searches
fragmentation patterns against a target/decoy protein database. Currently,
identified peptide sequences are filtered based on their database score and
accepted at a false discovery rate (FDR) of less than 1%. MaxQuant then
automatically determines the peptide ratios of SILAC pairs and calculates the
significance of regulation for all identified proteins.

2.5. Perspective and outlook

While total proteome quantitation is still a time-consuming experiment
requiring advanced instrumentation and specialized know-how, several
trends will make such experiments much more routine and available to
molecular biology laboratories in the future. Among them is the development
of a new generation of hybrid MS instruments named "LTQ-Orbitrap Velos,"
which offers significantly higher sensitivity due to new ion optics that
enhance the transfer of ions from the source to the MS by an order of
magnitude (Olsen et al., 2009). Similarly, this instrument enables faster
scan cycle times at higher performance with a new dual pressure linear ion
trap. Together this improves the scan speed more than twofold, which in effect
will reduce MS measurement time requirements for yeast expression proteomics.
In addition to developments in MS instrumentation, novel preparation methods
(such as "filter aided sample preparation"; Wisniewski et al., 2009) result
in more uniform samples between experiments and therefore even higher
reproducibility and identification rates. Finally, streamlined versions of
MaxQuant and novel bioinformatic tools will result in faster evaluation of
the data, a process that is still computationally intensive and time
consuming.

With these enhancements, expression proteomics by MS is likely to 
continue to become more widespread and will soon be a common tech- 
nique to comprehensively analyze the composition of yeast cells. 




3. Protocols 

In the following, we describe in detail how to grow SILAC labeled 
yeast for a proteomic experiment, how to process the samples for LC-MS, 
how to perform the measurements and how to analyze the resulting data. 



3.1. Yeast strains for SILAC proteomics experiments 

In principle, any yeast strain that is auxotrophic for lysine and/or arginine
is suitable for SILAC labeling, depending on which label is used. If a
particular strain needs to be used, a deletion of either LYS2 or ARG4 can be
introduced either directly by transformation with a PCR product (Janke et al.,
2004) or by crossing with a deletion strain such as BY4739. Primers to delete
LYS2 using the system described by Janke et al. (2004) are Lys2-S1
(5'-atttcagtga aaaactgcta atagagagat atcacagagt tactcactaa tgcgtacgct
gcaggtcgac-3') and Lys2-S2 (5'-ctaattcat atttaattat tgtacatgga catatcatac
gtaatgctca accttaatcg atgaattcga gctcg-3'); correspondingly, for ARG4, the
primers Arg4-S1 (5'-cgcaattgaa gagctcaaaa gcaggtaact atataacaag actaaggcaa
acatgcgtac gctgcaggtc gac-3') and Arg4-S2 (5'-gtcctagaag taccagacct
gatgaaattc ttgcgcataa cgtcgccatc tgctaatcga tgaattcgag ctcg-3') are used.

3.2. Media for SILAC labeling

Stock medium (any other drop out mix lacking arginine and lysine will
work):

YNB without amino acids    6.7 g/l
Glucose                    20 g/l

                     Final            Stock per    Amount of
                     concentration    100 ml       stock
                     [mg/l]                        [ml] for 1 l
Adenine sulfate      20               200 mg       10
Uracil               20               200 mg       10
L-Tryptophan         20               1 g          2
L-Histidine-HCl      20               1 g          2
L-Arginine-HCl (a)   20               1 g          2
L-Methionine         20               1 g          2
L-Tyrosine           30               200 mg       15
L-Leucine            60               1 g          6
L-Isoleucine         30               1 g          3
L-Phenylalanine      50               1 g          5
L-Glutamic acid      100              1 g          10
L-Aspartic acid      100              1 g          10
L-Valine             150              3 g          5
L-Threonine          200              4 g          5
L-Serine             400              8 g          5

(a) ONLY for lysine labeling alone.

To prepare media ready to use (use light/medium/heavy amino acids as
desired):

L-Lysine          30 mg/l
L-Arginine (a)    5 mg/l

(a) ONLY for double Lys/Arg labeling; the concentration of arginine
can be varied up to 20 mg/l; this low amount of 5 mg/l
minimizes arginine to proline conversion.
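
The stock volumes listed for the stock medium follow from the final concentration and the stock concentration. A minimal sketch of this arithmetic, with a function name chosen here for illustration, may help when scaling the recipe to other volumes:

```python
def stock_volume_ml(final_mg_per_l: float,
                    stock_mg_per_100_ml: float,
                    medium_volume_l: float = 1.0) -> float:
    """Volume of stock solution (ml) needed to reach the desired final
    concentration in the given volume of medium."""
    stock_mg_per_ml = stock_mg_per_100_ml / 100.0
    return final_mg_per_l * medium_volume_l / stock_mg_per_ml


if __name__ == "__main__":
    # Adenine sulfate: 20 mg/l final, stock of 200 mg per 100 ml
    print(stock_volume_ml(20, 200))
    # L-Serine: 400 mg/l final, stock of 8 g per 100 ml
    print(stock_volume_ml(400, 8000))
    # L-Leucine for 0.5 l of medium: 60 mg/l final, stock of 1 g per 100 ml
    print(stock_volume_ml(60, 1000, medium_volume_l=0.5))
```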



SILAC amino acids (Isotec-Sigma):

Lys 4:     L-Lysine-4,4,5,5-d4-HCl        Cat.# 616192
Lys 8:     L-Lysine-13C6,15N2-HCl         Cat.# 608041
Arg 6:     L-Arginine-13C6-HCl            Cat.# 643440
Arg 10:    L-Arginine-13C6,15N4-HCl       Cat.# 608033

3.3. Growing yeast cultures for SILAC labeling

1. Grow a preculture of a lysine auxotroph strain overnight in SC medium

2. Inoculate SILAC medium (either heavy or light) with a 1:10,000
dilution of the preculture

3. Grow overnight at 30 °C

4. Check OD next morning (cells should reach OD = 0.5-0.7)

5. To harvest, spin down cells for 10 min at 4000 rpm and 4 °C



3.4. Extract preparation for SILAC experiments

1. Resuspend cells in buffer S (150 mM K acetate, 2 mM Mg acetate,
1x protease inhibitor cocktail (Roche), 20 mM HEPES, pH 7.4) at a
density of 50 OD/ml (the minimum buffer volume for the bead mill is 3 ml)

2. Slowly drop the cell suspension into liquid nitrogen

3. Grind cells in an MM301 Ball Mill (Retsch), three cycles of 3 min at 10 Hz

4. Thaw the ground cells

5. Extract with detergent (1% Triton X-100) for 30 min at 4 °C while rotating

6. Spin down for 10 min at 1000 rpm and 4 °C

7. Collect supernatant

8. Measure protein concentration by Bradford assay

9. Mix extracts from strains that are used for the comparison in a 1:1 ratio
of protein amounts



3.5. In-solution digest of proteins for MS 

1. Reduce proteins for 20 min at RT in 1 mM dithiothreitol (DTT)

2. Alkylate proteins for 15 min using 5.5 mM iodoacetamide (IAA) at
RT in the dark

3. Digest with the endoproteinase Lys-C (Wako) using 1:50 w/w overnight
at RT (arginine and lysine labeled yeast proteins are digested with
Lys-C in a similar manner)

4. Dilute the resulting peptide mixtures with Millipore water to achieve a
final urea concentration below 2 M

5. For arginine labeled cells, add trypsin (modified sequencing grade,
Promega) 1:50 w/w and digest overnight

6. Quench trypsin and Lys-C activity by acidifying the reaction
mixtures with TFA to pH ~2



3.6. Test of label incorporation 

It is straightforward to calculate the SILAC incorporation efficiency of a
peptide by high-resolution MS. To this end, a small aliquot of yeast cells
grown in the presence of either [13C6]arginine or [13C6]lysine for several
generations is analyzed by MS. Proteins are extracted from the sample
as described above and separated by one-dimensional gel electrophoresis
(SDS-PAGE). The gel is Coomassie stained, and a slice corresponding
to the 30-40 kDa protein size range is excised and digested in situ with
LysC (or alternatively trypsin for arginine labeling). The resulting peptide
mixture is desalted and analyzed by nanoflow LC-tandem MS. The raw
MS data can be analyzed in the MaxQuant software suite as described
below. As an output, MaxQuant will generate a list of identified and
quantified proteins. The median (H/L) peptide and protein ratio for all
proteins directly reflects the SILAC amino acid incorporation rate. In
order to allow accurate quantitation of ratios in the subsequent
proteomic experiment, the labeling efficiency indicated by this ratio should
be at least 99%.
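
The conversion from a heavy-to-light (H/L) ratio to an incorporation percentage is simple arithmetic and can be sketched as follows. This is an illustration only, assuming that incorporation is defined as the heavy fraction H / (H + L); the ratios in the example are hypothetical and the function names are not part of MaxQuant.

```python
from statistics import median


def incorporation(h_over_l: float) -> float:
    """Fraction of heavy label, H / (H + L), for a given H/L ratio."""
    if h_over_l < 0:
        raise ValueError("ratio must be non-negative")
    return h_over_l / (1.0 + h_over_l)


def median_incorporation(ratios) -> float:
    """Incorporation estimated from the median H/L ratio of a run."""
    return incorporation(median(ratios))


if __name__ == "__main__":
    # Hypothetical H/L ratios of peptides from a labeling test
    ratios = [120.0, 85.0, 99.0, 150.0, 60.0]
    print(f"median H/L: {median(ratios):.0f}")
    print(f"incorporation: {median_incorporation(ratios) * 100:.1f}%")
```

Under this definition, an incorporation of 99% corresponds to a median H/L ratio of 99.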



3.7. Peptide IEF (Optional)

As an alternative to the peptide isoelectric focusing described here, other
peptide fractionation procedures yield good results. For example, anion
exchange fractionation can be used as described in Wisniewski et al. (2009).

1. To separate peptides according to their isoelectric point, 75 µg of
in-solution digested peptides are fractionated using the Agilent 3100
OFFGEL fractionator (Agilent, G3100AA).

2. Set up the system according to the manual of the High Res Kit, pH 3-10
(Agilent, 5188-6424), but replace the strips with 24 cm Immobiline DryStrip,
pH 3-10 (GE Healthcare, 17-6002-44) and the ampholytes with IPG
Buffer, pH 3-10 (GE Healthcare, 17-6000-87).

3. Rehydrate strips for 20 min with 20 µl rehydration buffer per well
containing 5% glycerol and ampholytes diluted 1:50.

4. Prepare 150 µl of peptide solution per well, containing 3.125 µg yeast
digest, 5% glycerol, and ampholytes diluted 1:50.

5. Apply mixture to each well of the OFFGEL device.

6. Close wells with a silicon cover seal to prevent evaporation of liquid.

7. Focus peptides for 50 kVh at maximum current of 50 µA, maximum
voltage of 8000 V and maximum power of 200 mW into 24 fractions.
Average run time is approximately 30 h.

8. Add 3% acetonitrile, 1% trifluoroacetic acid, and 0.5% acetic acid to
acidify each peptide fraction.

9. Desalt and concentrate each fraction on a reversed-phase C18 StageTip
(Rappsilber et al., 2007).

3.8. MS analysis

3.8.1. Equipment 

We perform all MS experiments on a nanoflow high-performance liquid
chromatography (HPLC) system (Agilent Technologies 1200, Waldbronn,
Germany) connected to a hybrid LTQ-Orbitrap classic or XL (Thermo
Fisher Scientific, Bremen, Germany) equipped with a nanoelectrospray
ion source (Proxeon Biosystems, Odense, Denmark). The HPLC system
consists of a solvent degasser, a nanoflow pump, and a temperature-controlled
microautosampler kept constantly at 4 °C in order to reduce sample
evaporation. The peptide mixtures are loaded onto a 15 cm analytical
column (75 µm inner diameter) packed in-house with a methanol slurry
of 3 µm reversed-phase, fully end-capped C18 beads (Reprosil-AQ Pur,
Dr. Maisch) using a pressurized "packing bomb" operated at 50-60
bar. Mobile phases for HPLC consist of (A) 99.5% Milli-Q water and
0.5% acetic acid (v/v); (B) 19.5% Milli-Q water, 80% acetonitrile,
and 0.5% acetic acid (v/v).

3.8.2. Procedure of sample preparation/injection 

1. Prior to MS analysis, elute all samples from the C18 StageTip directly into a 
96-well sample plate (Abgene, UK) using two times 20 µl buffer B (80% 
acetonitrile, 0.5% acetic acid). 

2. Concentrate samples in a "speed-vac" for 12 min in order to remove all 
organic solvent. 

3. Adjust sample volume to approximately 8 µl by adding an appropriate 
volume of buffer A (0.5% acetic acid). 

4. Load 5 µl of prepared peptide mixture onto the analytical column for 
20 min in 2% buffer B at a flow rate of 500 nl/min, followed by 
reversed-phase separation through a 90 min gradient ranging from 5% to 40% 
acetonitrile in 0.5% acetic acid. 

5. Wash the column for 10 min with high concentration of organic solvent 
(90% buffer B) and equilibrate it for another 10 min with buffer A (0.5% 
acetic acid) prior to loading of the next sample. 

6. The peptides eluting from the HPLC column are electrosprayed directly 
into the MS for detection. 
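
Because the gradient is stated in percent acetonitrile while the pump is programmed in percent buffer B (80% acetonitrile), a conversion is needed when setting up the method. The sketch below lays out one possible timetable; it assumes instantaneous steps between the loading, gradient, wash, and equilibration segments, and the function names are illustrative rather than taken from any instrument software.

```python
ACN_FRACTION_IN_B = 0.80  # buffer B contains 80% acetonitrile


def percent_b(acn_percent):
    """Convert a target acetonitrile percentage into percent buffer B."""
    return acn_percent / ACN_FRACTION_IN_B


def gradient_table(load_min=20, gradient_min=90, wash_min=10, equil_min=10,
                   acn_start=5.0, acn_end=40.0):
    """Return (time in min, %B, segment label) rows for one LC run."""
    t = 0
    rows = [(t, 2.0, "load sample in 2% B")]
    t += load_min
    rows.append((t, percent_b(acn_start), "gradient start"))
    t += gradient_min
    rows.append((t, percent_b(acn_end), "gradient end"))
    rows.append((t, 90.0, "wash in 90% B"))      # assumed step change
    t += wash_min
    rows.append((t, 0.0, "equilibrate in buffer A"))
    t += equil_min
    rows.append((t, 0.0, "end of run"))
    return rows


rows = gradient_table()
for time_min, b, label in rows:
    print(f"{time_min:4d} min  {b:5.2f}% B  {label}")
```

With these settings, 5% and 40% acetonitrile correspond to 6.25% and 50% buffer B, and one complete cycle takes 130 min.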



3.8.3. Mass spectrometry 

We operate the MS instrument in data-dependent mode by automatically 
switching between full survey scan MS and consecutive MS/MS acquisition. 
Survey full scan MS spectra (mass range m/z 300-2000) are acquired in 
the orbitrap section of the instrument with a resolution of R = 60,000 at 
m/z 400 (after accumulation to a "target value" of 1,000,000 in the linear ion 
trap). The 10 most intense peptide ions in each survey scan with an ion 
intensity above 500 counts and a charge state ≥ 2 are sequentially isolated to 
a target value of 5000 and fragmented in the linear ion trap by collisionally 
induced dissociation (CID/CAD). 

All peaks selected for fragmentation are automatically put on an exclusion 
list for 90 s, which ensures that the same ion is not selected for 
fragmentation again during that time. For optimal duty cycle, the fragment ion 
spectra are recorded in the LTQ-MS "in parallel" with the orbitrap full scan 
detection. For all survey scan measurements with the orbitrap detector, a 
lock-mass ion from ambient air (m/z 391.284286, 429.08875, and 
445.120025) is used for internal calibration as described, ensuring an overall 
sub-ppm mass accuracy for all detected peptides (Olsen et al., 2005). 
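
The selection logic of data-dependent acquisition with dynamic exclusion can be illustrated in a few lines. The following is a simplified sketch and not the instrument firmware: it assumes that excluded ions are matched by m/z within a ppm window (`tol_ppm`, a hypothetical value that is not given in the text) and that the charge filter is an inclusive lower bound (`min_charge`, set to 2 here as an assumption to reject singly charged ions).

```python
def select_precursors(peaks, scan_time_s, exclusion, top_n=10,
                      min_intensity=500.0, min_charge=2,
                      exclusion_s=90.0, tol_ppm=10.0):
    """Pick up to top_n precursors from one survey scan.

    peaks:     list of dicts with keys "mz", "intensity", "charge"
    exclusion: mutable list of (mz, expiry_time_s), updated in place
    """
    # Forget exclusion entries whose 90 s window has elapsed.
    exclusion[:] = [(mz, t) for mz, t in exclusion if t > scan_time_s]

    def is_excluded(mz):
        return any(abs(mz - ex_mz) / ex_mz * 1e6 <= tol_ppm
                   for ex_mz, _ in exclusion)

    candidates = [p for p in peaks
                  if p["intensity"] > min_intensity
                  and p["charge"] >= min_charge
                  and not is_excluded(p["mz"])]
    # Most intense ions are fragmented first.
    candidates.sort(key=lambda p: p["intensity"], reverse=True)
    chosen = candidates[:top_n]
    for p in chosen:
        exclusion.append((p["mz"], scan_time_s + exclusion_s))
    return chosen


survey = [
    {"mz": 500.25, "intensity": 1.0e4, "charge": 2},
    {"mz": 600.30, "intensity": 4.0e2, "charge": 2},  # below 500 counts
    {"mz": 700.40, "intensity": 2.0e4, "charge": 1},  # singly charged
    {"mz": 800.40, "intensity": 5.0e3, "charge": 3},
]
exclusion_list = []
first = select_precursors(survey, 0.0, exclusion_list)
second = select_precursors(survey, 30.0, exclusion_list)
third = select_precursors(survey, 91.0, exclusion_list)
```

In this toy run the same two ions are picked at 0 s, skipped at 30 s while they are on the exclusion list, and become eligible again at 91 s.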

For all MS experiments, data are saved in RAW file format (Thermo 
Scientific, Bremen, Germany) using Xcalibur 2.0 with Tune 2.2 or 2.4. 
All data are loaded into the in-house written software MaxQuant and 
analyzed as described below. 

3.9. Identification and quantitation of peptides and proteins 

The data analysis is performed with the MaxQuant software as described in 
Cox and Mann (2008), supported by Mascot as the database search engine 
for peptide identifications. In addition, a step-by-step protocol for the analysis 
can be found in Cox et al. (2009). In short, peaks in MS scans are determined 
as three-dimensional hills in the mass-retention time plane. They are 
then assembled to isotope patterns and SILAC pairs by graph theoretical 
methods. MS/MS peak lists are filtered to contain at most six peaks per 100 
Da interval and searched by Mascot (www.matrixscience.com) against a 
concatenated forward and reversed version of the yeast ORF database 
(Saccharomyces Genome Database, SGD™, at Stanford University; 
www.yeastgenome.org). Protein sequences of common contaminants, for 
example, keratins, are added to the database. The initial mass tolerance in MS 
mode is set to 7 ppm and the MS/MS mass tolerance is 0.5 Da. Cysteine 
carbamidomethylation is searched as a fixed modification, whereas N-acetyl 
protein, N-pyroglutamine, and oxidized methionine are searched as variable 
modifications. Labeled arginine and lysine are specified as fixed or 
variable modifications, depending on the prior knowledge about the parent 
ion. The resulting Mascot .dat files are loaded into the MaxQuant software 
together with the raw data for further analysis. SILAC peptide and protein 
quantitation is performed automatically with MaxQuant using default settings 
for parameters. Here, for each SILAC pair, the ratio is determined by a 
robust regression model fitted to all isotopic peaks and all scans that the pair 
elutes in. SILAC protein ratios are determined as the median of all peptide 
ratios assigned to the protein. Absolute protein quantitation is based on 
extracted ion chromatograms (XICs) of contained peptides. To minimize 
false identifications, all top-scoring peptide assignments made by Mascot are 
filtered based on prior knowledge of individual peptide mass error, SILAC 
state, and the correct number of lysine and arginine residues specified by the 
mass difference observed in the full scan between the SILAC partners. 
Furthermore, peptide assignments are statistically evaluated in a Bayesian 
model based on sequence length and Mascot score. We accept peptides and 
proteins with a false discovery rate (FDR) of less than 1%, estimated based 
on the number of accepted reverse hits. 
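
Two of the steps above lend themselves to a short illustration: the protein ratio as the median of its peptide ratios, and an FDR estimate from reverse-database hits. The sketch below is not MaxQuant's implementation. It assumes the simple convention FDR = (accepted reverse hits) / (accepted forward hits), ranks hits by a single score, and ignores ties; the function names are illustrative.

```python
from statistics import median


def protein_ratio(peptide_ratios):
    """SILAC protein ratio as the median of its peptide ratios."""
    return median(peptide_ratios)


def score_cutoff_for_fdr(hits, max_fdr=0.01):
    """Lowest score cutoff whose accepted set stays below max_fdr.

    hits: iterable of (score, is_reverse) tuples.
    Returns None if no cutoff satisfies the FDR limit.
    """
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)
    n_forward = n_reverse = 0
    cutoff = None
    for score, is_reverse in ranked:
        if is_reverse:
            n_reverse += 1
        else:
            n_forward += 1
        # Estimated FDR among everything accepted down to this score.
        if n_forward > 0 and n_reverse / n_forward < max_fdr:
            cutoff = score
    return cutoff


# Toy data: 150 forward hits, then two reverse hits and one more forward hit.
hits = [(s, False) for s in range(101, 251)]
hits += [(100, True), (99, True), (98, False)]
cutoff = score_cutoff_for_fdr(hits)
ratio = protein_ratio([0.9, 1.1, 4.0])
```

The median makes the protein ratio insensitive to a single outlying peptide, such as the value 4.0 in the toy input.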
Abstract 

Physical interactions mediated by proteins are critical for most cellular 
functions and altogether form a complex macromolecular "interactome" network. 
Systematic mapping of protein-protein, protein-DNA, protein-RNA, and 
protein-metabolite interactions at the scale of the whole proteome can advance 
understanding of interactome networks with applications ranging from single 
protein functional characterization to discoveries on local and global systems 
properties. Since the early efforts at mapping protein-protein interactome 
networks a decade ago, the field has progressed rapidly, giving rise to a growing 
number of interactome maps produced using high-throughput implementations 
of either binary protein-protein interaction assays or co-complex protein 
association methods. Although high-throughput methods are often thought to 
necessarily produce lower quality information than low-throughput 
experiments, we have recently demonstrated that proteome-scale interactome 
datasets can be produced with quality equal or superior to that observed in 
literature-curated datasets derived from large numbers of small-scale 
experiments. In addition to performing all experimental steps thoroughly and 
including all necessary controls and quality standards, careful verification of all 
interacting pairs and validation tests using independent, orthogonal assays 
are crucial to ensure the release of interactome maps of the highest possible 
quality. This chapter describes a high-quality, high-throughput binary 
protein-protein interactome mapping pipeline that includes these features. 




1. Introduction 

Interactions mediated by proteins and the complex "interactome" 
networks resulting from these interactions are essential for biological 
systems. Mapping protein-protein, protein-DNA, protein-RNA, and 
protein-metabolite interactions that form "interactome" networks is a 
major goal of functional genomics, proteomics, and systems biology 
(Vidal, 2005). Information obtained from large-scale efforts to identify 
protein interaction partners yields crucial biological insights throughout 
a range of applications. At the single protein level, interactome maps 
have helped assign functions to both uncharacterized and well-studied 
gene products (Oliver, 2000). At the systems level, interactome maps have 
enabled investigations of how regulatory circuits and global cellular 
network properties relate to biological functions (Han et al., 2004; Jeong et al., 
2001; Milo et al., 2002; Yu et al., 2008). 

The two major high-throughput strategies used so far to delineate 
protein-protein interactome networks are: (i) binary protein-protein 
interaction assays, which detect direct pairwise interactions, and (ii) affinity 
purification followed by mass spectrometry (AP-MS) approaches, which 
detect biochemically stable, copurifying protein complexes containing both 
direct and indirect protein associations. Classically, binary interaction assays 
have been based on the yeast two-hybrid (Y2H) system, which was developed 
20 years ago (Fields and Song, 1989) and has been improved over time to 
increase efficiency and quality (Durfee et al., 1993; Gyuris et al., 1993; Vidal 
et al., 1996). Of late, alternative approaches have been developed to detect 
binary interactions, such as protein arrays, protein complementation 
assays, and the split ubiquitin method (Miller et al., 2005; Tarassov et al., 
2008; Zhu et al., 2001). 

Until recently, high-throughput methods were regarded as more likely to 
produce lower quality information than low-throughput experiments. It has 
now been shown that highly reliable interactome datasets can be obtained at 
the scale of the whole proteome (Braun et al., 2009; Cusick et al., 2009; 
Simonis et al., 2009; Venkatesan et al., 2009) provided that all experimental 
steps are thorough and all necessary controls and quality standards are 
included. Lastly, careful verification of all candidate interactions and 
experimental validation using independent interaction assays are necessary to 
ensure the release of interactome maps of the highest possible quality. 

Even when highly reliable, interactome maps should be considered as 
network models of interactions that can happen between all proteins 
encoded by the genome of an organism of interest. As such, they correspond 
to static representations of collapsed time-, space-, and condition-dependent 
interactions that dynamically regulate the behavior and developmental fate 
of diverse tissues. Thus, interactome maps should be used as static 
scaffold-like information from which the dynamic features of biologically relevant 
interactions, that is, those that do happen in vivo, can be modeled by 
integrating additional functional information such as transcriptional and 
phenotypic profiling data (Ge et al., 2001, 2003; Gunsalus et al., 2005; Vidal, 
2001). Ultimately, novel potentially insightful interactions need to be 
evaluated for their biological significance using genetic experiments, where 
specific cis-acting interaction-defective alleles (IDAs) of one or both proteins 
or trans-acting disruptors are tested functionally (Dreze et al., 2009; Endoh 
et al., 2002; Vidal and Endoh, 1999; Vidal et al., 1996; Zhong et al., 2009). 



2. High-Quality Binary Interactome Mapping 

The quality of any dataset can be compromised by a high rate of "false 
positives", which need to be addressed in two fundamentally different contexts. 
One relates to avoidable experimental errors leading to wrong information, 
and the other relates to as yet undiscovered fundamental properties of 
proteins (Fig. 12.1). Our binary interactome mapping strategy is designed 
to differentiate between these two classes of issues, designated "technical" 
and "biological" false positives. 







• Technical false positives 

All techniques used to map protein interactions can give rise to artifacts. 
It goes without saying that artifacts or technical false positives should be 
identified and removed as much as possible with appropriately designed 
experimental conditions and controls. Potential artifacts are different for 
every method and can arise systematically or sporadically. Often it takes 
several years of collective use, after the original description of a method, for 
systematic artifacts to be understood and thus become avoidable. 

[Figure 12.1 flowchart, top to bottom: 
Test all pairwise combinations for possible physical interactions in the 
search space allowed by cloned ORF availability (e.g., Y2H) 
-> Raw data 
-> Specific controls for primary method (e.g., Y2H autoactivator removal 
and verifications); artifacts eliminated 
-> Potential interaction pairs (false positives?) 
-> Validation by orthogonal interaction assay; confidence score 
-> Biophysical interaction network map 
-> Orthogonal datasets, small-scale follow-up 
-> Pseudo-interactions or biological interactions] 

Figure 12.1 General strategy to map binary interactome networks. All possible pairs 
of a search space are tested using a large-scale binary interaction detection assay such 
as the yeast two-hybrid (Y2H) system. First-round positives constitute the raw dataset 
in which artifacts need to be identified and eliminated. The resulting set of putative 
interactions is then validated using alternative binary interaction detection assays. This 
step allows determination of the dataset precision or experimentally determined 
confidence scores for all individual interactions. Overlap of biophysical interactions with 
other types of datasets, such as coexpression or phenotypic profiles, or small-scale 
experimental follow-up, allows the identification of biologically relevant binary 
interactions. 

In biochemical AP-MS experiments, or in the design and use of antibodies, 
nonspecific binding by abundant proteins or contaminant proteins 
introduced while carrying out experiments represent technical false positives 
that can and should be removed. Y2H is based on a set of growth 
selections designed to identify the reconstitution of a transcription factor 
mediated by two hybrid proteins. Although powerful, the system needs to 
be carefully controlled because unrelated spontaneous genetic suppressors 
can appear during these growth selections. Such artifacts can reliably be 
removed by thorough implementation of the methods described below. 

However careful the execution of Y2H mapping experiments or any 
other high-throughput methodology is, the precision of the obtained 
dataset (i.e., the complement of the false discovery rate (FDR), or 1 - FDR) 
still needs to be determined to estimate both systematic and sporadic 
technical false positives that might remain undetected (Fig. 12.1). We advocate 
below further rigorous experimental verifications of all interacting pairs using 
the Y2H version used to produce a dataset, followed by careful validation 
using orthogonal protein interaction assays to determine overall quality. 
Once these steps have been implemented, the result is a set of 
well-demonstrated interactions, proven to physically interact. We refer to such 
protein pairs as "biophysical interactors". 

• Biological false positives 

While it is plausible that most biophysical interactions are biologically 
relevant, their relevance, and the mechanism by which biophysically 
demonstrated protein interactions affect the physiology of an organism, 
remain to be demonstrated in subsequent, often laborious experiments 
(Fig. 12.1). It is theoretically possible that a subset of biophysical interactions 
might be biologically inconsequential because, among other possibilities, 
the interacting proteins remain either spatially or temporally separated 
throughout the lifetime of an organism. Such "pseudo-interactions" can be 
viewed as biological false positives that need to be eliminated or, alternatively, 
might represent interesting evolutionary remnants similar to the existence of 
pseudo-genes in many organisms (Venkatesan et al., 2009). 

2.1. Production and verification of Y2H datasets 

Fields and Song (1989) first described the Y2H system as the reconstitution 
of a transcription factor through expression of two hybrid proteins, one 
fusing the DNA-binding (DB) domain to a protein X (DB-X) and the other 
fusing an activation domain (AD) to a protein Y (AD-Y). In the last 20 years 
much has been learned about possible artifacts and appropriate controls, so 
that today Y2H can be considered not only one of the most efficient, but 
also one of the most reliable binary interaction assays available for small-, 
medium-, and large-scale interaction mapping. We next discuss specific 
artifacts of the Y2H system and the appropriate controls developed to detect 
and remove them. 

2.1.1. Autoactivators 

A common artifact of the Y2H system is autoactivation of Y2H-inducible 
reporter genes. This occurs when DB-X (where X is a full-length protein or a 
protein fragment) activates transcription of Y2H reporter genes irrespective of 
the presence of any AD-Y. Three classes of autoactivators need to be considered: 
(i) genuine transcription factors that contain a bona fide AD and consequently 
will likely score as autoactivators when fused to DB, (ii) proteins that are not 
transcription factors in their natural context but can behave as autoactivators 
because they contain a cryptic AD (cognate autoactivators), and (iii) 
nontranscription factor proteins that contain one or more cryptic ADs that are only 
functional as truncated fragments and not when expressed in the context of 
full-length proteins (de novo autoactivators). 

Both genuine transcription factors and cognate DB-X autoactivators can 
be identified and removed by performing prescreens for reporter gene 
activation either with AD expressed alone (i.e., in the absence of any Y 
fused protein) or even with no AD at all. 

De novo autoactivators are more difficult to detect than transcription 
factors and cognate autoactivators. The Y2H system is based on positive 
growth selections for potentially rare events, such as the finding of a single 
cDNA out of a complex library. The Y2H system can just as rigorously 
select for mutations that occur during the course of a screen and which 
convert a nonactivator protein into a de novo autoactivator. Such events are 
relatively frequent and some early Y2H datasets may have been inadvertently 
overpopulated by spontaneous autoactivators (Ito et al., 2001; Yu 
et al., 2008). A method to systematically remove these artifacts (Walhout 
and Vidal, 1999) employs a counter-selectable marker CYH2 present on 
the AD-Y coding plasmid together with control plates that contain 
cycloheximide (CHX). At every stage of the interactome mapping pipeline, 
reporter gene activity is evaluated in parallel both on regular selective plates 
and on selective plates that contain CHX. The CYH2 marker allows 
the selection of yeast cells that do not contain any AD-Y and thus the 
convenient identification of DB-X autoactivators. 
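
The parallel plate readout translates into a simple decision rule. The sketch below assumes that growth has already been scored as a yes/no call on each plate type; the function and the well labels are illustrative only and not part of the published pipeline.

```python
def classify_spot(grows_on_selective, grows_on_chx):
    """Interpret paired growth calls for one yeast spot.

    grows_on_selective: growth on the regular selective plate
    grows_on_chx:       growth on the selective plate containing CHX,
                        where only cells that lost the AD-Y plasmid survive
    """
    if grows_on_chx:
        # Reporter is active without AD-Y, so DB-X activates on its own.
        return "DB-X autoactivator"
    if grows_on_selective:
        return "candidate interaction"
    return "negative"


# Hypothetical scoring of three spots: (selective plate, CHX plate).
spots = {"A1": (True, False), "A2": (True, True), "A3": (False, False)}
calls = {well: classify_spot(*growth) for well, growth in spots.items()}
```

Only spots that grow on the regular selective plate but not on the CHX plate are carried forward as candidate interactions.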



2.1.2. Retesting to verify candidate interactions 

In addition to autoactivating mutations in the DB-X protein, other genetic 
changes can occur during a screen. Mutations of the full-length DB-X or 
AD-Y protein might permit interactions that are otherwise undetectable or 
inhibited. Other mutations, such as cis-acting mutations in reporter genes 
and trans-acting mutations at unlinked genetic loci, could lead to reporter 
gene activation in the absence of any physical interaction between DB-X 
and AD-Y. To identify and remove such artifacts, all interaction candidates 
are systematically verified using yeast transformants freshly thawed from 
DB-X and AD-Y archival stocks. Haploid yeast cells of opposite mating 
type, each containing DB-X or AD-Y expression plasmids, are mated 
according to the interacting pairs identified in the original screens and are 
tested for reproducible Y2H phenotypes to confirm reporter gene activation. 
Usually ~50% of interaction candidates can be successfully verified, 
which suggests that perhaps half of all primary Y2H positives belong to the 
classes of artifacts described above. 

2.1.3. A high-quality Y2H implementation 

Besides the precautions already mentioned, the Y2H version we have 
developed presents the following features that ensure high data quality. 

Low DB-X and AD-Y hybrid protein expression. The use of low copy 
number yeast expression vectors together with the presence of weak 
promoters expressing DB-X and AD-Y hybrid proteins leads to low expression, 
which minimizes artifactual interactions driven by mass action. Use of 
high copy number vectors can increase DB-X and AD-Y protein expression 
and increase the sensitivity of the assay. This comes at the cost of 
increasing the detection of nonspecific interactions (Braun et al., 2009). The 
use of high copy number vectors should be accompanied by rigorous quality 
control and validation of every individual interaction with multiple assays. 

Yeast strains. We have used two different Y2H strain backgrounds over 
the years (Vidal et al., 1996; Yu et al., 2008). The protocols described are 
applicable to Y8800 and Y8930, MATa and MATα, respectively, two 
strains derived from PJ69-4 (James et al., 1996) which harbor 
the following genotype: leu2-3,112 trp1-901 his3-200 ura3-52 gal4Δ 
gal80Δ GAL2-ADE2 LYS2::GAL1-HIS3 MET2::GAL7-lacZ cyh2R. 
The availability of two haploid strains of opposite mating types enables 
the use of mating to efficiently combine large collections of DB-X and 
AD-Y constructs. By convention the Y8800 MATa and Y8930 MATα 
strains are transformed with AD-Y and DB-X constructs, respectively. 

Y2H-inducible reporter genes. The reporter genes GAL2-ADE2 and 
LYS2::GAL1-HIS3 are integrated into the yeast genome. Expression of the 
GAL1-HIS3 reporter gene should be tested with 1 mM 3AT 
(3-amino-1,2,4-triazole, a competitive inhibitor of the HIS3 gene product). When 
dealing with DB-X autoactivators, higher 3AT concentrations can be used to 
circumvent autoactivator-dependent activity of GAL1-HIS3. Interactions identified at 
higher 3AT concentrations should be accompanied by rigorous quality control 
and validation of every individual interaction using multiple assays. 

Y2H controls. Y2H-inducible reporter gene expression levels can vary 
from weak to very strong, although these levels may not reflect the actual 
affinity of protein-protein interactions as they take place in their native 
environment. To help determine which candidate clones likely represent 
genuine biophysical interactors, six controls are added systematically to the 
master plates of Y2H experiments (Walhout and Vidal, 2001). This collection 
of diploid control strains contains plasmid pairs expressing DB-X and 
AD-Y hybrid proteins across a wide spectrum of interaction read-outs. For 
each control strain, a short description of plasmids and DB-X and AD-Y 
hybrid proteins is provided in Table 12.1 and expected phenotypes are 
shown in Fig. 12.2. 

Please email "pascal_braun@dfci.harvard.edu" to request strains, 
plasmids, and controls. 

2.2. Validation of Y2H datasets to produce reliable binary 
interactome maps 

Despite the rigorous implementation of controls for identification of tech- 
nical artifacts, a fraction of technical false positives can still be recovered in 
large-scale datasets. Well-described artifacts might have escaped detection, 
or it is possible that certain classes of artifacts have not been identified yet 
and consequently no controls are available to detect and remove them. 
Therefore, the quality of any dataset must be further assessed before it can be 
used as a reliable interactome map. 

In earlier attempts at addressing this question for Y2H, protein pairs that 
activated two or more distinct reporters or pairs that were detected in two 
or more configurations (e.g., DB-X/AD-Y and DB-Y/AD-X) were 
considered to be of "higher quality", that is, more likely to be real 
biophysical interactors, than those pairs that activated only one reporter or were 
only found in a single orientation. Historically, and especially with cDNA 
screens, these criteria did indeed offer limited protection against artifacts, 
and enabled identification of more likely "true" interactions (Vidalain et al., 
2003). Today, however, such artifacts can be removed more systematically 
and more reliably by the controls described in Sections 2.1.1 (CHX control) 
and 2.1.2 (verification). All interactions that pass these controls are 
considered high-quality Y2H interactions, irrespective of whether or not they are 
detected in only one orientation or if they activate only one reporter. 

Table 12.1 Y2H controls 

Control     Plasmid pairs         Protein                      Interaction strength 
Control 1   pDEST-AD              No insert                    None, background 
            pDEST-DB              No insert 
Control 2   pDEST-AD-E2F1         Human E2F1, aa 342-437       Weak (control for CHX control plates) 
            pDEST-DB-CYH2-pRB     Human pRB, aa 302-928 
Control 3   pDEST-AD-Jun          Mouse Jun, aa 250-325        Moderately strong 
            pDEST-DB-Fos          Rat Fos, aa 132-211 
Control 4   pDEST-AD              No insert                    Very strong 
            pDEST-DB-Gal4         Yeast Gal4, aa 1-881 
Control 5   pDEST-AD-dE2F1        Drosophila E2F, aa 225-433   Strong 
            pDEST-DB-dDP          Drosophila DP, aa 1-377 
Control 6   pDEST-AD-CYH2-dE2F1   Drosophila E2F, aa 225-433   Strong (control for CHX plates) 
            pDEST-DB-dDP          Drosophila DP, aa 1-377 

Identities and description of expected phenotypes for the six controls used in every Y2H experiment. 

[Figure 12.2 image: Controls 1-6 (columns) spotted on five selective plates 
(rows, top to bottom): Sc-Leu-Trp; Sc-Leu-Trp-His + 1 mM 3AT; 
Sc-Leu-His + 1 mM 3AT + 1 mg/l CHX; Sc-Leu-Trp-Ade; 
Sc-Leu-Ade + 1 mg/l CHX] 

Figure 12.2 Phenotypes of Y2H controls. Six strains referred to as Controls 1-6, each 
containing a different pair of DB-X and AD-Y hybrid proteins, are spotted on media 
selecting for the presence of both plasmids (top row) and, after an overnight incubation, 
replica-plated onto media selecting for Y2H-dependent reporter activation (rows 2 and 4). 
The six strains express DB-X/AD-Y pairs that result in reporter gene activation at various 
intensities. DB-X autoactivation is tested on plates that select for the loss of the AD-Y 
plasmid (rows 3 and 5). 

Many "true" interaction pairs activate only one Y2H reporter or are 
detected in only one configuration. This is due to effects that are unrelated 
to the interaction capacity of the two examined proteins. The genomic 
context of the different reporters or use of promoters that require different 
levels of reconstituted transcription factor can lead to differential reporter 
activation. Similarly, the use of hybrid proteins imposes steric constraints on 
proteins that can interfere with detection of many interactions in at least one 
configuration. This was shown by testing a set of well-documented positive 
control interaction pairs in Y2H and four other binary interaction assays. 
Consistently, only half of the positive scoring controls were detected in 
both orientations in any of the five assays (Braun et al., 2009). Thus, while 
activation of multiple reporters and detection of interactions in multiple 
configurations can be comforting, these attributes are neither necessary nor 
sufficient requirements for high-quality interactions. 

Various experimental methods and computational approaches have been 
described to evaluate the quality of large-scale interactome datasets. Most 
computational methods estimate the correlation between physical interaction 
data and secondary data, such as expression profiling or types of 
functional annotations (Bader et al., 2004; Deane et al., 2002). Determination 
of data quality using this approach can effectively lead to filtered 
datasets that might be biased for particular classes of interactors, such as 
those with strong coexpression correlation. Such correlative data evaluation 
approaches make implicit assumptions about the nature of protein-protein 
interactions, which can potentially lead to erroneous conclusions (Yu et al., 
2008). Interactome maps can be productively integrated with orthogonal 
datasets to gain novel insights into biology (Pujana et al., 2007; Vidal, 2001). 
If interaction datasets have been prefiltered using orthogonal data, then such 
higher level analysis becomes less informative. 

Another approach is to overlap the information from different interaction 
datasets to assess the FDR. In these analyses crucial details of the 
underlying experiments used in the respective screens are often ignored. 
Four critical parameters have to be taken into consideration (Venkatesan 
et al., 2009): (i) the number and identity of ORFs used in each screen 
(completeness), (ii) the detection limitations of the assays used (assay 
sensitivity; affected by many parameters such as strains, location of protein tags, 
and detection methods), (iii) the extent of incomplete sampling in each search 
space (sampling sensitivity), and (iv) the potential presence of technical false 
positives (precision). Without knowledge of these parameters for each dataset, 
any conclusion about data quality based on their overlap is meaningless. 
Thus, given the inherent limitations of computational approaches for quality 
control, experimental methods involving alternative protein interaction 
assays are strongly preferred. 
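
A small calculation shows why a low overlap between two datasets says little about their quality. The sketch below rests on two simplifying assumptions that are not stated in the text: the fraction of all true interactions that a screen detects is modeled as the product of completeness, assay sensitivity, and sampling sensitivity, and the two screens detect interactions independently of each other. All numbers are hypothetical.

```python
def detected_fraction(completeness, assay_sensitivity, sampling_sensitivity):
    """Fraction of all true interactions one screen is expected to find.

    Assumes the three parameters combine multiplicatively.
    """
    return completeness * assay_sensitivity * sampling_sensitivity


def expected_overlap(n_true, fraction_a, fraction_b):
    """Expected number of true interactions found by both screens,
    assuming the screens detect interactions independently."""
    return n_true * fraction_a * fraction_b


# Hypothetical screens, both with perfect precision.
frac_a = detected_fraction(0.8, 0.2, 0.6)
frac_b = detected_fraction(0.7, 0.25, 0.5)
n_true = 20000
shared = expected_overlap(n_true, frac_a, frac_b)
print(f"screen A finds {n_true * frac_a:.0f}, screen B finds "
      f"{n_true * frac_b:.0f}, expected overlap {shared:.0f}")
```

In this toy case fewer than one in ten interactions from either screen is expected in the other, although neither dataset contains a single false positive.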

2.2.1. Quality control I: Experimental assessment 
of dataset precision 

One experimental approach to validate dataset quality consists of testing a 
representative sample of potential interactions from a given dataset with an 
orthogonal interaction assay. Since there is apparently not a single assay 
capable of detecting all protein-protein interactions tested, and considering 
that the subset of interaction pairs scoring positive in any two assays is rarely 
identical (Braun et al., 2009), it is to be expected that only a fraction of 
interactions from a particular dataset will be detected by a validation assay. 
This is a consequence of the nature of interaction assays and the biochemical 
diversity of interactions, and not per se an indication of the quality of 
the original dataset. Characterizing the validation assay of choice using a 
positive reference set (PRS) and random reference set (RRS) of 
well-documented and random protein-protein interactions, respectively (Braun 
et al., 2009; Cusick et al., 2009; Venkatesan et al., 2009; Yu et al., 2008), 
provides an estimate of the assay sensitivity and the background of the 
validation assay. If desired, the stringency of the validation assay can be 
adjusted to decrease the background (and assay sensitivity) or to increase the 
assay sensitivity (and background). Comparing the validation assay results of 
the dataset sample with the PRS/RRS benchmark data enables estimation of the 
dataset precision (Braun et al., 2009; Venkatesan et al., 2009; Yu et al., 2008). 
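
One simple way to turn the benchmark into a number is a two-component mixture. The sketch below assumes that true interactions in the dataset validate at the PRS rate and false positives at the RRS rate, so that the observed rate is precision × PRS rate + (1 - precision) × RRS rate. This is an illustrative model; the cited studies may use a more elaborate statistical treatment, and the rates shown are hypothetical.

```python
def estimate_precision(sample_rate, prs_rate, rrs_rate):
    """Estimate dataset precision from validation rates.

    Solves sample_rate = p * prs_rate + (1 - p) * rrs_rate for p and
    clips the result to the interval [0, 1].
    """
    if prs_rate <= rrs_rate:
        raise ValueError("PRS rate must exceed RRS rate for a usable assay")
    precision = (sample_rate - rrs_rate) / (prs_rate - rrs_rate)
    return min(1.0, max(0.0, precision))


# Hypothetical rates: PRS 30%, RRS 2%, dataset sample 25%.
p = estimate_precision(0.25, 0.30, 0.02)
print(f"estimated precision: {p:.2f}")
```

A dataset sample that validates at the same rate as the PRS yields an estimate of 1, which is why a validation rate far below 100% can still indicate a high-quality dataset.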

2.2.2. Quality control II: Experimental confidence scores 
for individual interactions 

When dataset precision is determined using a single assay, validation rates
between 20% and 40% can be expected for both PRS and high-quality
datasets under conditions in which the RRS detection rate is below 5%
(Braun et al., 2009). In the long term, it will be highly desirable not only to
estimate the overall precision of a dataset but also to validate all
protein-protein interactions individually. Validation of individual
interactions in a dataset can be strengthened if multiple complementary assays
are used to test the interactions (Braun et al., 2009). The concept of
calibrating and benchmarking assay performance with the PRS/RRS can be applied
to multiple assays and can be used to calculate a confidence score for
individual biophysical protein-protein interactions. Multiple interaction
assays are first benchmarked against common PRS and RRS reference sets to
obtain comparable calibrations of assay sensitivity and background. Then, all
interactions identified in a large-scale interactome screen are characterized
using the same assay implementations. After the results from all assays have
been collected for an interaction pair, a confidence score can be calculated
based on the prior PRS/RRS calibration of the assays and the validation results
of the respective interaction (Braun et al., 2009). PRS/RRS clones for several
organisms are available upon request (Braun et al., 2009; Simonis et al.,
2009; Venkatesan et al., 2009; Yu et al., 2008).
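
One way such a confidence score could be computed is sketched below. This is an illustrative naive Bayes combination, not the published scoring scheme: it assumes that the assays behave independently, that each assay's PRS detection rate approximates its sensitivity and its RRS detection rate approximates its background, and that a prior probability is supplied by the user. The assay names and rates are hypothetical.

```python
def confidence_score(results, calibration, prior=0.5):
    """Combine results from several assays into a posterior probability.

    results:     {assay name: True if the pair scored positive, else False}
    calibration: {assay name: (PRS detection rate, RRS detection rate)}
    prior:       assumed prior probability that the pair is a true interaction
    """
    odds = prior / (1.0 - prior)
    for assay, positive in results.items():
        sensitivity, background = calibration[assay]
        if positive:
            odds *= sensitivity / background
        else:
            odds *= (1.0 - sensitivity) / (1.0 - background)
    return odds / (1.0 + odds)


# Hypothetical calibration of two assays against common PRS/RRS sets.
calibration = {"assay_A": (0.30, 0.02), "assay_B": (0.25, 0.05)}
print(confidence_score({"assay_A": True, "assay_B": True}, calibration))
print(confidence_score({"assay_A": True, "assay_B": False}, calibration))
```

Because single-assay sensitivities are modest, a negative result lowers the score only slightly, whereas a positive result in a low-background assay raises it substantially.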

2.3. Biological evaluation of binary interactome maps 

The identification of high-confidence biophysical interactions is an important 
first step towards answering many biological questions both at small and large 
scale. However, even robustly demonstrated biophysical interactions might be 
biological false positives, or pseudo-interactions, that never occur in vivo. 

Biological relevance of protein-protein interactions has been inferred
from network analyses or by combining interactome information with
systematic genetic data (Collins et al., 2007a,b; Pujana et al., 2007). Despite
some success, these approaches remain constrained by the availability of
high-quality datasets, and are limited because they yield predictions only.
Until demonstrated by thorough mechanistic studies of all proteins
involved, the biological role of protein-protein interactions remains elusive.
Such mechanistic studies are typically carried out at small scale, so this
approach is unsustainable and cost-prohibitive for characterizing the hundreds
of thousands of soon-to-be-discovered human protein interactions
(Venkatesan et al., 2009).

Biological relevance of a biophysical protein-protein interaction may be
derived from an observed phenotype following genetic disruption of this
specific interaction in vivo. Such IDAs can occur naturally, as has been found
for some inherited Mendelian diseases (De Nicolo et al., 2009; Zhong et al.,
2009). For these alleles the causal link between disruption of the biophysical
interaction and the observable phenotype must be demonstrated. Alternatively,
IDAs can be generated experimentally using a reverse two-hybrid
approach (Dreze et al., 2009; Vidal et al., 1996). For defining a biological
function of a biophysical interaction using such experimentally generated
IDAs, a critical step is the identification of a phenotype and subsequent
demonstration of causality.

Certain interactions may have subtle or modifying roles in the regulation 
of cellular functions. Disruption of such interactions individually may lead 
to subtler, easy to overlook phenotypes. Disruption of such interactions in 
the presence of other genetic or environmental perturbations may produce 
more observable systems alterations. For those, quantitative mathematical 
modeling may be useful for analyzing small or synergistic phenotypic 
consequences. 




3. High-Throughput Y2H Pipeline 

3.1. Assembly of DB-X and AD-Y expression plasmids 

The first step towards binary interactome mapping is the generation of
expression plasmids. For high-throughput experiments it is preferable to
use sequence-independent recombinational subcloning technologies such as
Gateway cloning (Walhout et al., 2000). Large resources containing
thousands of distinct ORFs in Gateway entry vectors are available for a
few organisms (Lamesch et al., 2004, 2007; Reboul et al., 2003; Rual et al.,
2004). These ORFs can be transferred into Gateway-compatible expression
vectors in a simple single-step reaction (Fig. 12.3). Although not mandatory,
linearizing the destination vectors by restriction digestion improves
recombination efficiency and decreases background as well as the chance of
obtaining incorrect LR recombination clones. The restriction enzyme should be
chosen so that the destination vector is digested only once between the
two Gateway recombination sites. The Gateway LR reaction, carried out
using enzyme and buffer concentrations optimized by titration, gives best
yields at 25 °C for 18 h but can also be carried out for ~2 h at room
temperature. Completed recombination reactions are transformed into
Escherichia coli, grown for 18 h, and plasmids are isolated. This step can be
done manually or by using liquid handling robots. Because all steps
are carried out in 96-well microtiter plates, protocols are provided for the
equivalent of one 96-well plate.

Figure 12.3 Pipeline for preparation of Y2H reagents. The pipeline runs from
producing Gateway entry clones to transformation and quality control of the
yeast strains used in Y2H screens. Protein-encoding ORFs are first transferred
by Gateway LR reactions into pDEST-AD and pDEST-DB, and amplified in bacteria.
DNA is then extracted for yeast transformations. After transformation, DB-X
hybrid proteins are tested for autoactivator phenotypes and then rearrayed
before screening. AD-Y hybrid proteins are combined into minipools of 188
different clones per pool.



Protocol 1: Restriction digestion of Y2H destination vectors 

1. Combine in one tube: 

- 11 µg destination vector (pDEST-AD or pDEST-DB).

- 11 µl of 10× restriction enzyme buffer.

- 2.5 µl of SmaI restriction enzyme (50 units).

- 85.5 µl filter-sterilized water.

2. Mix well by pipetting up and down several times.

3. Incubate at 25 °C for 12–16 h.

4. Incubate at 65 °C for 20 min to heat-inactivate the restriction enzyme.

Run 500 ng of digested destination vector on a 1% agarose gel
alongside 500 ng of undigested destination vector to confirm complete
digestion. The heat-inactivated reaction mix can be used for Gateway LR
reactions without further purification.



Protocol 2: High-throughput Gateway LR recombinational cloning 

1. Combine in one tube: 

- 110 µl of SmaI-digested destination vector (11 µg).

- 110 µl of 5× LR clonase buffer.

- 55 µl of 1× TE.

- 55 µl of LR clonase enzyme mix (Invitrogen) (keep this mix on ice).

2. Homogenize by gently pipetting up and down.

3. With a multichannel pipette, distribute 3 µl of this solution into every
well of a 96-well microtiter plate.

4. Add 2 µl of entry clone per well.

5. Centrifuge briefly. 

6. Incubate at 25 °C for 18 h. 
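
The volumes in Protocol 2 can be checked with a short calculation. The sketch below is only a bookkeeping aid: the component volumes are taken from the protocol, whereas the function names and the idea of scaling linearly to several plates are our assumptions.

```python
# Component volumes (in µl) for one 96-well plate, as listed in Protocol 2.
MASTER_MIX_UL = {
    "SmaI-digested destination vector": 110.0,
    "5x LR clonase buffer": 110.0,
    "1x TE": 55.0,
    "LR clonase enzyme mix": 55.0,
}


def master_mix_summary(components, wells=96, per_well_ul=3.0):
    """Return total mix volume, volume dispensed, and the spare volume."""
    total = sum(components.values())
    needed = wells * per_well_ul
    return {"total_ul": total, "needed_ul": needed, "spare_ul": total - needed}


def scale_mix(components, plates):
    """Scale every component linearly to the given number of plates."""
    return {name: volume * plates for name, volume in components.items()}


print(master_mix_summary(MASTER_MIX_UL))
print(scale_mix(MASTER_MIX_UL, plates=4))
```

The mix as written leaves a small surplus beyond the 96 aliquots, which accommodates pipetting losses.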



Protocol 3: Bacterial transformation 

The following protocol is used to transform, amplify, and isolate Gateway 
LR reaction products: 

1. Thaw 1 ml of competent DH5α-T1 (Invitrogen) cells on ice (with a
transformation efficiency greater than 5x10 antibiotic resistant colonies
per µg of input DNA).

2. Add 10 µl of competent cells per well directly into a 96-well plate
containing 5 µl Gateway LR reaction mix in each well.

3. Seal the plate with adhesive foil. 

4. Incubate on ice for 20 min. 

5. Heat shock at 42 °C in a standard thermocycler for 1 min. 

6. Incubate on ice for 2 min. 

7. Add 100 µl of prewarmed (37 °C) SOC medium per well. Seal the plate
with adhesive foil to avoid contamination.

8. Incubate at 37 °C for 1 h.

9. Transfer the transformation mix into a 96-well deep-well plate
containing 1 ml of LB medium with 100 µg/ml of ampicillin.

10. Incubate on a 96-well plate shaker at 37 °C for 20 h.

11. Remove 5 µl for subsequent analysis by PCR (Protocol 4).

12. Remove 80 µl of the overnight culture, mix with 80 µl of 40% (w/v)
autoclaved glycerol and store at -80 °C.

13. Use the remainder of the overnight culture for plasmid isolation.

SOC medium. 0.5% yeast extract, 2% tryptone, 10 mM NaCl, 2.5 mM
KCl, 10 mM MgCl2, 10 mM MgSO4, 20 mM glucose. Autoclave the solution with
all ingredients except glucose, let it cool down, and then add the glucose.
Sterilize the final solution by passing it through a 0.2 µm filter. SOC
medium can be stored at room temperature.

Transformation controls. It is good practice to systematically control for
media contamination (no cells), competent cell contamination (cells only),
Gateway LR reaction contamination (negative control of LR reaction), and
transformation efficiency (10 pg of pUC19). If the four controls indicate
clean and successful transformation, proceed to the next quality control step.

Recombination control. To confirm that Gateway LR reactions occurred
properly, analyze recombination products by bacterial culture PCR using
destination vector-specific primers (Protocol 4). For each transformation
plate select one row for PCR.

Protocol 4: Bacterial culture PCR 

Dilute 5 µl of bacterial culture into 95 µl of sterile water and mix by pipetting
up and down. Keep bacterial cultures at 4 °C until PCR results are determined.
For one 96-well plate of PCR, prepare in a tube on ice:

- 330 µl of 10× HiFi Platinum Taq polymerase buffer (Invitrogen).

- 120 µl of 50 mM MgSO4 (final concentration 1.8 mM).

- 33 µl of 40 µM dNTPs (final concentration 400 nM).

- 3.3 µl of 200 µM AD or DB forward primer (final concentration
180 nM).

- 3.3 µl of 200 µM Term reverse primer (final concentration 180 nM).

- 20 µl of HiFi Platinum Taq polymerase (Invitrogen).

- 2.5 ml of filter-sterilized water.

Aliquot 30 µl of the reaction mix into every well of a soft-shell,
V-bottom 96-well microtiter plate. Keep on ice. Add 3 µl of the diluted
bacterial culture per well as DNA template. Wells G12 and H12 are used as
negative control (water as template) and positive control (10 ng of empty
pDEST-AD or pDEST-DB), respectively.

Place the PCR plate on a thermocycler and run the following program: 

Step 1: Denaturation at 94 °C for 4 min. 

Step 2: Denaturation at 94 °C for 30 s. 

Step 3: Annealing at 58 °C for 30 s. 

Step 4: Elongation at 68 °C for 3 min. 
Repeat Steps 2-3-4, 34 times. 

Step 5: Final elongation at 68 °C for 10 min. 

Step 6: Hold at 10 °C. 

Primer sequences. The primers are designed such that the 5'-primer
confers AD or DB vector specificity, whereas the Term 3'-primer is
identical for both vectors.

AD: 5'-CGCGTTTGGAATCACTACAGGG-3' 

DB: 5'-GGCTTCAGTGGAGACTGATATGCCTC-3' 

Term: 5'-GGAGACTTGACCAAACCTCTGGCG-3' 

Once PCR reactions are completed, analyze 5 µl of PCR product on a
1% agarose gel by comparing sizes to that of the control from well H12 (the
PCR amplicon from a destination vector containing the Gateway cassette
has an expected size of ~1.9 kb). The H12 control serves simultaneously as a
positive control for the PCR and as a negative control for the LR
recombination reaction. PCR failure is indicated by the absence of the H12
product, and failure of the LR reaction may be indicated by a dominant band of
1.9 kb across all wells. Successful LR reactions will give rise to the size
distribution of the original ORFs, to which ~280 bp of vector sequences
are added due to the AD, DB, and Term primer positions. If the PCR
results indicate successful LR recombinations, prepare archival stocks by
combining 80 µl of bacterial cultures with 80 µl of 40% (w/v) glycerol in a
round-bottom 96-well microtiter plate. The rest of the cultures are used
for plasmid isolation. Afterwards, ensure successful plasmid isolation by
analyzing 2 µl of the DNA preparation on a 1% agarose gel.
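
The size comparison described above can be expressed as a small helper. In this sketch the ~1.9 kb and ~280 bp figures come from the text, while the function names and the 15% size tolerance for reading bands off a gel are hypothetical choices of ours.

```python
EMPTY_VECTOR_AMPLICON_BP = 1900  # from the text: ~1.9 kb Gateway cassette amplicon
VECTOR_FLANK_BP = 280            # from the text: ~280 bp added by primer positions


def expected_amplicon_bp(orf_length_bp):
    """Expected PCR product size for a correctly recombined ORF."""
    return orf_length_bp + VECTOR_FLANK_BP


def classify_band(observed_bp, orf_length_bp, tolerance=0.15):
    """Compare an observed band with the expected product and the empty vector.

    tolerance is an assumed relative error for gel-based size estimates.
    """
    expected = expected_amplicon_bp(orf_length_bp)
    if abs(observed_bp - expected) <= tolerance * expected:
        return "consistent with LR product"
    if abs(observed_bp - EMPTY_VECTOR_AMPLICON_BP) <= tolerance * EMPTY_VECTOR_AMPLICON_BP:
        return "possible unrecombined destination vector"
    return "unexpected size"


print(classify_band(1200, orf_length_bp=900))
print(classify_band(1900, orf_length_bp=600))
```

Note that an ORF of roughly 1.6 kb yields a product close to 1.9 kb, so for such clones band size alone cannot distinguish a successful reaction from an unrecombined vector.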

To ensure the absence of plate orientation mistakes when processing
multiple plates, sequence-verify PCR products amplified from one column
of each 96-well miniprep plate. Use 1 µl of the DNA preparation as template
for PCR. The primers, recipes, and PCR conditions are identical to those
presented in Protocol 4. BLASTn of the acquired sequences against a reference
database identifies clones and allows verification of their correct locations.



3.2. Yeast transformation 

DB-X and AD-Y expression plasmid constructs are individually transformed
into competent Y8930 (MATα) and Y8800 (MATa) strains, respectively.



Protocol 5: Yeast transformation 

This protocol requires two solutions that need to be freshly prepared from
stock solutions in order to obtain maximum transformation efficiencies.
Tris-EDTA-lithium acetate solution (TE/LiAc) is prepared by 10-fold
dilution of 10× TE and 1 M LiAc stocks to give 10 mM Tris-HCl
(pH 8.0), 1 mM EDTA (pH 8.0), and 100 mM LiAc final concentration.
TE/LiAc polyethyleneglycol (PEG) solution is prepared by combining
8 volumes of 44% (w/v) PEG 3350 with 1 volume of 10× TE and 1 volume
of 1 M LiAc. The following volumes and quantities are given for carrying
out one 96-well plate of transformations.

1. Streak Y8800 and Y8930 on separate YEPD plates and incubate at
30 °C for 48–72 h to obtain isolated colonies.

2. For each strain, inoculate 20 ml of YEPD with 10 isolated colonies.
Incubate at 30 °C on a shaker for 14–18 h.

3. Measure and record the OD600, which should be between 4.0 and 6.0.
Dilute cells into YEPD medium to obtain a final OD600 = 0.1. Use
100 ml of YEPD medium per 96-well plate of transformations.

4. Incubate at 30 °C on a shaker until OD600 reaches 0.6–0.8 (4–6 h).

5. Boil carrier DNA (salmon sperm DNA, Sigma-D9156) for 5 min then
place on ice until needed.

6. Harvest cells by centrifugation at 800 × g for 5 min. Discard the
supernatant and resuspend cells gently in 10 ml of sterile water.

7. Centrifuge as described in step 6 and discard the supernatant.

8. Resuspend cells in 10 ml of TE/LiAc solution, centrifuge, and discard
the supernatant.

9. Resuspend cells in 2 ml of TE/LiAc solution, then add 10 ml of
TE/LiAc/PEG solution supplemented with 200 µl of boiled carrier DNA.
Mix the solution by inversion.

10. Dispense 120 µl of this mix into each well of a round-bottom 96-well
microtiter plate.

11. Add 10 µl of plasmid DNA to the competent yeast and mix by
pipetting up and down several times. Use liquid handling robots to
transfer and mix 96 samples at a time. Seal the plate with adhesive foil.

12. Incubate at 30 °C for 30 min.

13. Subject to heat shock in a 42 °C water bath for 15 min.

14. Centrifuge the 96-well plate for 5 min at 800 × g. Carefully remove the
supernatant using a multichannel pipette.

15. To each well add 100 µl of sterile water and resuspend cell pellets by
pipetting up and down.

16. Centrifuge the 96-well plate for 5 min at 800 × g, then carefully remove
90 µl of water from each well using a multichannel pipette.

17. Resuspend cell pellets by vortexing the 96-well plate on a shaker.

18. Spot 5 µl/well of cell suspension onto an appropriate selective plate
(Sc-Trp for AD-Y, Sc-Leu for DB-X). For a consistent footprint, use
of a liquid handling robot is recommended.

19. Incubate at 30 °C for 72 h.

20. Using sterile flat-end toothpicks, pick transformed yeast colonies into
individual wells of a 96-well round-bottom plate containing 160 µl of
selective medium (Sc-Trp for AD-Y, Sc-Leu for DB-X).

21. Incubate on a shaker at 30 °C for 72 h.

22. Prepare archival stocks by combining 80 µl of the yeast culture with
80 µl of 40% (w/v) autoclaved glycerol in a round-bottom 96-well
plate. Store at -80 °C.
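
The dilution in step 3 of Protocol 5 follows from C1 × V1 = C2 × V2. The helper below is a minimal sketch with a hypothetical function name. It assumes the recorded OD600 is proportional to cell density, which for dense overnight cultures generally requires measuring a diluted sample and multiplying back.

```python
def overnight_volume_ml(measured_od, target_od=0.1, final_volume_ml=100.0):
    """Volume of overnight culture to dilute into fresh YEPD.

    Solves measured_od * V1 = target_od * final_volume_ml for V1.
    """
    if measured_od <= target_od:
        raise ValueError("overnight culture is not denser than the target")
    return target_od * final_volume_ml / measured_od


# Example: an overnight culture at OD600 = 5.0, diluted to 100 ml at OD600 = 0.1.
volume = overnight_volume_ml(5.0)
print(f"{volume:.1f} ml culture + {100.0 - volume:.1f} ml YEPD")
```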



3.3. Autoactivator removal and AD-Y pooling 
3.3.1. Autoactivator identification and removal 

To be as close as possible to the physiology of the cells in which interactions are
detected, the identification of autoactivators is achieved in diploid yeast strains
obtained by mating DB-X yeast strains with the Y8800 yeast strain
transformed with the AD-encoding plasmid containing no insert (empty
pDEST-AD). All diploid yeast strains showing a growth phenotype stronger than
the "no interaction" Y2H control (control 1) are considered autoactivators.
Because activation of the GAL1::HIS3 reporter gene is easier to achieve
than that of GAL7::ADE2, it is used for autoactivator identification.



Protocol 6: Identification of autoactivating DB-X hybrid proteins 

Before starting the experiment, prepare one YEPD plate, one Sc-Leu-Trp 
plate and one Sc-Leu-Trp-His + 1 mM 3AT plate for each 96-well plate of 
DB-X yeast strains to be tested. The YEPD plates should be prepared at least 
1 week in advance to allow them to dry. This allows fast penetration of liquid 
in the mating step and prevents merging of adjacent spots due to excess liquid. 

1. Add 160 µl of fresh liquid Sc-Leu medium to each well of a round-bottom
96-well microtiter plate followed by 5 µl from individual
glycerol stocks of each DB-X yeast strain to each well.

2. For every plate of DB-X yeast strains to be tested, inoculate a test tube
containing 0.55 ml of Sc-Trp medium with Y8800 transformed with
empty pDEST-AD.

3. Incubate at 30 °C for 72 h on a shaker.

4. Spot 5 µl of DB-X liquid cultures on a YEPD plate using a liquid
handling robot.

5. Allow the spots to dry for 30–60 min.

6. Aliquot the pDEST-AD transformed Y8800 yeast culture into a
round-bottom 96-well plate.

7. Spot 5 µl of pDEST-AD transformed Y8800 on top of the DB-X spots.

8. Spot Y2H controls at the bottom of the plate.

9. Incubate mating plates at 30 °C for 14–18 h.

10. Replica-plate onto Sc-Leu-Trp medium to select for diploid cells.

11. Incubate at 30 °C for 14–18 h.

12. Replica-plate from the Sc-Leu-Trp medium onto Sc-Leu-Trp-His + 1 mM
3AT medium. Nonautoactivating yeast cells are not able to activate the
GAL1::HIS3 reporter gene and hence should not grow on this medium.

13. Incubate at 30 °C for 14–18 h.

14. "Replica-clean" Sc-Leu-Trp-His + 1 mM 3AT plates by placing each
plate on a piece of velvet stretched over a replica-plating block and
pushing evenly on the plate to remove excess yeast. Replace the cloth
and move to process the next plate until all plates have been cleaned.

15. Incubate at 30 °C for 72 h.

16. Score growth phenotypes.

Growth phenotypes are scored by comparison to the "no interaction" 
Y2H control (control 1). All yeast strains showing a stronger growth 
phenotype than control 1 are considered autoactivators. To reliably identify 
autoactivators it is best to score growth twice independently. If a yeast clone 
is given two different scores, accepting the most stringent one ensures high 
quality of the starting material for subsequent interactome mapping. 
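
The double-scoring rule can be written down compactly. The sketch below assumes a hypothetical integer growth scale on which larger numbers mean stronger growth, and it interprets "most stringent" as the score most likely to flag a clone as an autoactivator, that is, the higher of the two growth scores.

```python
def is_autoactivator(score_first, score_second, control_score):
    """Flag a DB-X clone if either independent score exceeds control 1."""
    return max(score_first, score_second) > control_score


def nonautoactivators(scores, control_score):
    """Return clone IDs to keep; scores maps clone ID -> (score 1, score 2)."""
    return [
        clone
        for clone, (first, second) in scores.items()
        if not is_autoactivator(first, second, control_score)
    ]


# Hypothetical scores on a 0-3 scale, with control 1 scored as 0.
scores = {"DB-ORF1": (0, 0), "DB-ORF2": (0, 1), "DB-ORF3": (2, 3)}
print(nonautoactivators(scores, control_score=0))
```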

Autoactivators are physically removed from the collection of DB-X
transformed yeast clones by robotic rearraying of nonautoactivator yeast
clones into new plates. During the rearray step plate positions G12 and H12
are left empty for control purposes. New glycerol stocks are prepared from
this consolidated collection of nonautoactivating yeast strains and used for
subsequent Y2H screens.

1. From archival glycerol stocks containing all of the individual DB-X yeast
clones, cherry-pick nonautoactivating DB-X yeast clones into plates
containing 160 µl Sc-Leu (DB-X) liquid medium.

2. Incubate at 30 °C for 72 h on a 96-well plate shaker.

3. Prepare an archival stock by combining 80 µl of the yeast culture with
80 µl of 40% (w/v) autoclaved glycerol in a round-bottom 96-well plate.
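
A rearray layout that keeps G12 and H12 free can be generated programmatically before it is handed to a liquid handling robot. The following sketch is illustrative only. The row-by-row fill order and the function names are our assumptions, and any real worklist format depends on the robot used.

```python
ROWS = "ABCDEFGH"
COLUMNS = range(1, 13)
CONTROL_WELLS = {"G12", "H12"}  # left empty for controls, as described above


def usable_wells():
    """Return the 94 wells of a 96-well plate available for clones."""
    wells = [f"{row}{col}" for row in ROWS for col in COLUMNS]
    return [well for well in wells if well not in CONTROL_WELLS]


def rearray(clone_ids):
    """Map each clone ID to (destination plate number, well), filling row by row."""
    wells = usable_wells()
    layout = {}
    for index, clone in enumerate(clone_ids):
        plate, position = divmod(index, len(wells))
        layout[clone] = (plate + 1, wells[position])
    return layout


layout = rearray([f"clone{i}" for i in range(100)])
print(layout["clone0"], layout["clone93"], layout["clone94"])
```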

Although much less frequent, autoactivating AD-Y can also occur. The
previous protocol can easily be adapted for AD-Y autoactivator identification
by use of AD-specific reagents wherever appropriate. As an alternative,
to reduce time and cost, it is possible to test AD-Y autoactivation using
pools. For this, each AD pool, described in Protocol 7, is mated with Y8930
transformed with the DB-encoding plasmid containing no insert (empty
pDEST-DB) and then processed as in Protocol 6. If a diploid strain shows growth
on autoactivator detection plates, the responsible AD-Y yeast clone can be
identified by deconvoluting the AD-Y pool. This step is achieved by testing
all 188 AD-Y yeast clones constituting the pool for autoactivation. Once
identified, the autoactivating AD-Y yeast clone is removed and the affected
pool is reassembled without it (Protocol 7).

3.3.2. Efficient screening by AD pooling 

The pools used in the Y2H pipeline combine 188 different AD-Y expressing
yeast clones. This experimentally defined pool size provides an
optimal compromise between screening efficiency (number of plates to be
processed) and screen sensitivity (number of interactors identified).



Protocol 7: Construction of AD pools

This protocol describes the construction of one pool of 188 different AD-Y 
hybrid constructs transformed into Y8800, starting from two 96-well plates 
of AD-Y yeast strains. 

1. For each of the two plates of 94 AD-Y constructs: inoculate 500 µl per
well of Sc-Trp medium with 5 µl per well of AD-Y yeast strains.

2. Grow on a shaker at 30 °C for 4 days.

3. Resuspend yeast cell cultures by thoroughly vortexing the culture plates.

4. Measure the OD600 to ensure that growth is homogeneous throughout
each plate, and hence that each AD-Y yeast strain will be represented in the
same proportion.

5. Transfer the contents of the two culture plates into a sterile trough.

6. Mix thoroughly to ensure equal representation of all AD-Y yeast strains
in the pool.

7. On a liquid handling platform, prepare archival stocks by combining
80 µl of the pooled yeast cultures with 80 µl of 40% (w/v) autoclaved
glycerol in round-bottom 96-well microtiter plates.

If additional copies of the AD-Y pools are required, these should be 
made according to the protocol above and not by amplification of existing 
pools, as amplification can lead to loss of representation within the pool. 



Protocol 8: Assessing equal representation of AD-Y clones in pools 

Before the AD pools are used for Y2H experiments, equal representation of 
each of the 188 AD-Y clones in the pools should be confirmed. Biased pools 
and low representation of some AD-Y yeast cells will decrease if not 
eliminate the ability to detect protein interactions involving the underrep- 
resented hybrid proteins. 

1. For each plate of AD-Y pools streak 5 µl of glycerol stock from two
randomly selected wells onto Sc-Trp plates.

2. Grow at 30 °C for 72 h.

3. From each Sc-Trp plate, pick 96 isolated colonies and lyse yeast cells
according to Protocol 9.

4. Add 3 µl of the yeast cell lysate as PCR template.

5. Carry out PCR according to Protocol 10.

6. Run 5 µl of PCR product on a 1% agarose gel. If PCR products can be
detected in most reactions, proceed to analyze the corresponding plate
by end-read sequencing.

7. Identify the obtained sequences by BLASTn. If the created pools are not
biased, the frequency at which yeast cells containing identical AD-Y
plasmids were picked (and hence the sequence identifications) should
follow a normal distribution.
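
A simple tally of the sequence identifications helps to spot gross bias. The sketch below is our own illustration rather than part of the published protocol. It counts how often each clone was picked and, as a rough reference point, computes how many distinct clones would be expected if 96 colonies were drawn uniformly at random from 188 equally represented clones. The clone identifiers are hypothetical.

```python
from collections import Counter


def expected_distinct(n_clones=188, n_picked=96):
    """Expected number of distinct clones under uniform random sampling."""
    return n_clones * (1.0 - (1.0 - 1.0 / n_clones) ** n_picked)


def summarize_picks(clone_ids):
    """Summarize BLASTn identifications from the picked colonies."""
    counts = Counter(clone_ids)
    most_common_id, most_common_n = counts.most_common(1)[0]
    return {
        "picked": len(clone_ids),
        "distinct": len(counts),
        "most_common": most_common_id,
        "max_count": most_common_n,
    }


print(round(expected_distinct(), 1))
print(summarize_picks(["ORF1", "ORF2", "ORF1", "ORF3"]))
```

A pool in which far fewer distinct clones are observed than this reference value, or in which a single clone dominates the picks, is a candidate for reassembly.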



Protocol 9: Yeast cell lysis

1. Prepare lysis buffer by dissolving 2.5 mg/ml zymolyase 20T (21,100 U/g,
Seikagaku Corp.) in 0.1 M sodium phosphate buffer (pH 7.4). Keep on ice.

2. Aliquot 15 µl of lysis buffer into the wells of a 96-well PCR plate. Keep
on ice.

3. Pick a small amount of yeast cells (not more than fits on the very end of a
standard 200 µl tip) and resuspend in the lysis buffer in the PCR plate.

4. Place the PCR plate on a thermocycler and run the following program:

Step 1: 37 °C for 15 min
Step 2: 95 °C for 5 min
Step 3: Hold at 10 °C

5. Add 100 µl of filter-sterilized water to each well.

6. Centrifuge 10 min at 800 × g.

7. Store at -20 °C.



Protocol 10: Yeast lysate PCR 

For each 96-well plate of PCR reactions, prepare the following reaction 
mix on ice: 

- 330 µl of 10× HiFi Platinum Taq (Invitrogen) polymerase buffer.

- 120 µl of 50 mM MgSO4 (final concentration 1.8 mM).

- 33 µl of 40 µM dNTPs (final concentration 400 nM).

- 3.3 µl of 200 µM AD primer (final concentration 180 nM).

- 3.3 µl of 200 µM Term primer (final concentration 180 nM).

- 20 µl of HiFi Platinum Taq polymerase (Invitrogen).

- 2.5 ml of filter-sterilized water.

Aliquot 30 µl into every well of a 96-well PCR plate. Keep on ice. To
each well, add 3 µl of the yeast cell lysate (Protocol 9) as DNA template. Seal
the plate with adhesive aluminum foil.

Place the PCR plate on a thermocycler and run the following program: 

Step 1: Denaturation at 94 °C for 4 min. 
Step 2: Denaturation at 94 °C for 30 s. 
Step 3: Annealing at 58 °C for 30 s. 
Step 4: Elongation at 68 °C for 3 min. 

Repeat Steps 2-3-4, 34 times.
Step 5: 68 °C for 10 min. 
Step 6: Hold at 10 °C. 



3.4. Screening and phenotyping 

The Y2H pipeline consists of three essential stages, which together yield
highly reliable interactions: primary screening, secondary phenotyping, and
verification (Fig. 12.4). The high-throughput Y2H pipeline presented here
has been used to produce several high-quality proteome-scale binary
interactome maps (Rual et al., 2005; Simonis et al., 2009; Yu et al., 2008).

Figure 12.4 Y2H screening pipeline. Three steps, primary screening, phenotyping,
and retesting, ensure high throughput and reliable removal of artifacts. For primary
screens, 94 distinct DB-X constructs are mated against a minipool containing 188 AD-Y
hybrids. Positive colonies are picked from selective plates and in "secondary
phenotyping" are evaluated on two types of selective plates and respective
autoactivation control plates. Protein pairs considered as "candidate Y2H
interactions" are identified by DNA sequencing of PCR products amplified from
positive colonies. All identified pairs are verified using fresh archival yeast
stocks. DB-X/AD-Y pairs that score positive on at least three out of four
independent plate sets are considered high-quality Y2H interactions
(see text for details).

Protocol 11: Y2H primary screening 

Pour all required agar plates at least 1 week before starting the experiments 
and store them without wrapping at room temperature. Storage ensures that 
the plates are sufficiently dry, which in turn will prevent merging of spotted 
yeast cultures in the mating step which can otherwise occur due to excess 
liquid and slow absorption into the agar. 
[Day 0: Inoculation] 

1. Thaw glycerol stocks of the DB-X yeast strains and AD pools to be
tested. One person can easily handle a batch of 100 mating plates,
for example, 10 96-well plates of DB-X yeast clones tested against 10
96-well plates with AD-Y pools.

2. Inoculate 96-well plates that contain 160 µl selective medium in every well
(Sc-Leu for DB plates, Sc-Trp for AD pool plates), with 5 µl/well
of the thawed glycerol stock plates.

3. Seal all plates with adhesive tape and return glycerol stocks to -80 °C.

4. Incubate the inoculated cultures at 30 °C on a shaker for 72 h.

[Day 3: Mating]

1. For each combination [AD-Y pool plate × 96 DB-X plate] spot 5 µl/well
of the respective AD-Y pool liquid culture onto a mating plate (YEPD)
using a liquid handling robot.

2. Allow spots to dry for 30–60 min.

3. Spot 5 µl/well of each DB-X on top of the AD pool spots.

4. Spot Y2H controls onto every plate.

5. Incubate mating plates at 30 °C for 14–18 h.

[Day 4: Replica-plating]

1. Replica-plate mated yeast cells from mating plates onto screening plates
(Sc-Leu-Trp-His + 1 mM 3AT).

2. To detect de novo autoactivators, for each distinct plate of DB-X yeast
clones, replica-plate yeast from three mating plates (with three different
AD pools) onto Sc-Leu-His + 1 mM 3AT + 1 mg/l CHX plates.

3. Incubate at 30 °C for 14–18 h.

[Day 5: Replica-clean]

1. Replica-clean all plates by placing each plate on a piece of velvet
stretched over a replica-plating block and pushing evenly on the plate
to remove excess yeast cells. Replace the cloth and move to process the
next plate until all plates have been cleaned. The plates need to be
cleaned enough to reduce background, but excessive cleaning can also
lead to accidental removal of positives.

2. Incubate at 30 °C for 5 days.

[Day 10: Score and pick colonies] 

Pick primary positive colonies from screening plates and resuspend in a
96-well plate containing liquid medium (Sc-Leu-Trp). Only consider colonies that
grew better than background as indicated by control 1 of the six Y2H controls
(Fig. 12.2). Only pick primary positives where the corresponding spots on the
CHX plates are negative. Consider all three CHX plates as controls. Since
every individual DB-X construct is mated against a pool of 188 AD-Y
constructs, it is possible to obtain multiple interactions per spot. To account
for this infrequent yet possible event we pick at most three colonies per spot.

1. Pick positive yeast colonies into a 96-well plate containing 160 µl/well
Sc-Leu-Trp medium. Leave positions G12 and H12 empty for subsequent
controls.

2. Incubate the culture plate at 30 °C for 72 h.

3. The cultures can be used directly for phenotyping (Protocol 12, starting at
Step 2). It is also recommended to prepare an archival glycerol stock by
combining 80 µl of the yeast culture with 80 µl of 40% (w/v) autoclaved
glycerol in a 96-well plate, sealing the plates with adhesive tape and
storing at -80 °C.



Protocol 12: Phenotyping 

[Day 0: Inoculation] 

1. Thaw glycerol stocks of primary positives. 

2. Spot 5 µl/well onto Sc-Leu-Trp plates using a 96-well liquid handling
robot.

3. Seal all glycerol stock plates with adhesive tape and return to -80 °C.

4. Add Y2H controls.

5. Incubate the Sc-Leu-Trp plates at 30 °C for 48 h.

[Day 2: Replica-plating]

1. Replica-plate from Sc-Leu-Trp plates onto four phenotyping plates:

- Sc-Leu-Trp-His + 1 mM 3AT

- Sc-Leu-Trp-Ade

- Sc-Leu-His + 1 mM 3AT + CHX (1 mg/l)

- Sc-Leu-Ade + CHX (1 mg/l)

The first two plates are used to assess Y2H reporter activity; the two
CHX plates enable detection of autoactivators.

2. Clean the plates immediately after replica-plating. This step will
minimize background growth.

3. Incubate the phenotyping plates at 30 °C for 72 h.

[Day 5: Scoring]

1. Identify autoactivators by inspecting CHX plates. Any yeast spot showing
growth on these plates should not be considered for further processing.

2. Identify candidate interactions (secondary positives). It is useful to
differentiate positives activating one reporter gene (most often GAL1::HIS3)
from those activating both reporter genes. An example of a Sc-Leu-Trp plate and
the four assay plates along with proper scoring are shown (Fig. 12.5).

3. Patch all secondary positives on fresh Sc-Leu-Trp plates.

4. Incubate the Sc-Leu-Trp plates at 30 °C for 48 h.

5. Lyse cells according to Protocol 9.

6. Amplify the DB-X and AD-Y inserts of positive colonies
by yeast colony PCR according to Protocol 10 for subsequent ORF
identification by end-read sequencing. At this stage the matched PCR
products coding for putatively interacting proteins are physically
separated. It is critical to track the matching AD-Y and DB-X PCR products
so that interacting pairs can be identified after sequencing.

Once sequencing data have been received and the candidate protein 
pairs have been identified, a list of unique candidate interaction pairs can be 
compiled. 
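
The scoring logic of Protocol 12 and the compilation of unique pairs can be summarized as follows. This is a minimal sketch: growth on each plate is reduced to a hypothetical yes/no call relative to the controls, and the ORF identifiers are invented for illustration.

```python
def classify_spot(his, ade, chx_his, chx_ade):
    """Classify one spot from growth calls on the four phenotyping plates.

    his, ade:         growth on Sc-Leu-Trp-His + 3AT and Sc-Leu-Trp-Ade
    chx_his, chx_ade: growth on the corresponding CHX control plates
    """
    if chx_his or chx_ade:
        return "autoactivator"
    if his and ade:
        return "positive: both reporters"
    if his or ade:
        return "positive: one reporter"
    return "negative"


def unique_pairs(identified_pairs):
    """Collapse repeated (DB-X, AD-Y) identifications into a sorted unique list."""
    return sorted(set(identified_pairs))


print(classify_spot(his=True, ade=False, chx_his=False, chx_ade=False))
print(unique_pairs([("ORF_A", "ORF_B"), ("ORF_A", "ORF_B"), ("ORF_C", "ORF_D")]))
```

Keeping the DB-X and AD-Y identifications together as a tuple from the moment of sequencing is one way to avoid losing the pairing once the PCR products have been separated.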



3.5. Verification 

Protocol 13: Verification of candidate Y2H interaction pairs 

While the CHX control at every step identifies spontaneous autoactivators
arising from mutations in DB-X, this last verification step protects against
other potential artifacts, for example, from mutations elsewhere in the yeast
genome, and ensures robust high data quality. To reach maximum
reproducibility, robustness, and reliability of Y2H interactions, this critical
step is carried out a total of four times independently (16 plates
corresponding to four sets of four assay plates), ideally by four different
experimenters. Only interactions that score positive in at least three out of
four plate sets, and do not once score as autoactivators, are accepted as
verified Y2H interactions.
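
The acceptance rule translates directly into code. In the sketch below each of the four plate sets is summarized by one hypothetical call ("positive", "negative", or "autoactivator"); the function name and the call labels are ours.

```python
def is_verified(calls, required_positives=3):
    """Apply the verification rule to the calls from four plate sets."""
    if len(calls) != 4:
        raise ValueError("expected calls from exactly four plate sets")
    if "autoactivator" in calls:
        return False  # a single autoactivator call rejects the pair
    return calls.count("positive") >= required_positives


print(is_verified(["positive", "positive", "negative", "positive"]))
print(is_verified(["positive", "positive", "positive", "autoactivator"]))
```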

Before the verification experiment can be done, it is necessary to rearray 
yeast clones corresponding to candidate Y2H interacting pairs into new 
plates. During the rearray step, plate positions G12 and H12 should be left 
empty for subsequent controls. 

1. From archival glycerol stocks of the individual AD and DB transformed 
yeast clones, rearray the (candidate) interaction partner clones 
into matching positions of plates containing 160 µl Sc-Trp (AD-Y)
and Sc-Leu (DB-X) liquid media. 

2. Incubate at 30 °C on a 96-well plate shaker for 72 h. 

3. Prepare an archival stock by combining 80 µl of the yeast culture with
80 µl of 40% (v/v) autoclaved glycerol in a round-bottom 96-well plate.

Figure 12.5 Phenotyping plates and scoring. First, autoactivators are identified and
crossed out. The stringency of autoactivator detection is high such that even slight
growth on the CHX control plates leads to elimination of the respective candidate.
Subsequently, growth is evaluated on the selective -His and -Ade plates using the six
controls (Fig. 12.1) as reference.



[Day 0: Inoculation] 

1. Thaw glycerol stocks of rearrayed Y2H candidate pairs completely. 

2. With 5 µl of glycerol stock, inoculate 160 µl of fresh Sc-Leu (DB-X)
and Sc-Trp (AD-Y) liquid media dispensed in round-bottom 96-well 
culture plates. 

3. Incubate at 30 °C for 72 h. 

[Day 3: Mating]

1. Dispense 5 µl/well of AD-Y liquid culture onto a YEPD mating plate.

2. From the matching DB-X plate, dispense 5 µl/well of DB-X on top of
the AD-Y spots. 

3. Add Y2H controls. 

4. Incubate at 30 °C for 14-18 h. 

[Day 4: Selection of diploids] 

1. Replica-plate mated yeast cells onto Sc-Leu-Trp diploid selection plates.

2. Incubate at 30 °C for 14-18 h. 

[Day 5: Phenotyping of diploids]

1. Replica-plate diploid yeast cells onto the four phenotyping plates and 
autoactivator identification plates. 

2. Immediately after, replica-clean all plates thoroughly by placing each 
plate on a piece of velvet stretched over a replica-plating block and 
pushing evenly on the plate to remove excess yeast. Replace the cloth 
and move to process the next plate until all plates have been cleaned. 

3. Incubate at 30 °C for 3 days. 

[Day 10: Scoring] 

The scoring of each of the four plate sets is done independently in the same 
way as for secondary phenotyping. We consider as verified only those Y2H 
pairs that scored positive in at least three out of four plate sets and are never 
scored as an autoactivator. 
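
The acceptance rule above can be written down explicitly, which helps when the four plate sets are scored by different experimenters and merged afterward. The following is a minimal sketch, not part of the original protocol; the names `PlateSetScore` and `is_verified` are hypothetical, and it assumes each plate set yields one positive/negative call and one autoactivator call per candidate pair.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PlateSetScore:
    """Score of one candidate pair on one set of four assay plates (hypothetical record)."""
    positive: bool       # growth on the selective plates relative to the controls
    autoactivator: bool  # any growth on the CHX control plates


def is_verified(scores: List[PlateSetScore], min_positive: int = 3) -> bool:
    """Apply the rule from the text: positive in at least three of four
    plate sets and never scored as an autoactivator."""
    if len(scores) != 4:
        raise ValueError("expected exactly four independent plate sets")
    if any(s.autoactivator for s in scores):
        return False
    return sum(s.positive for s in scores) >= min_positive
```

A pair that scores positive three times but shows CHX growth once is rejected, reflecting the high stringency of autoactivator detection described for Fig. 12.5.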



3.6. Media and plates 

3.6.1. Nonselective rich yeast medium (YEPD) 

The Y8800 and Y8930 yeast strains are propagated on solid agar YEPD 
plates or in liquid YEPD medium. 

YEPD media 

1. Mix 20 g of yeast extract, 40 g of bacto-peptone and 1900 ml of water. 

2. Autoclave for 45 min. 

3. Store at room temperature. 

4. Before use add 50 ml of 40% (w/v) autoclaved glucose and 15 ml of 
65 mM adenine solution per liter of media. 

YEPD agar plates 

1. Mix 20 g of yeast extract and 40 g of bacto-peptone with 950 ml of water
in a 2-l flask.

2. Add a stir bar.

3. Mix 40 g of agar and 950 ml of water in a second 2-l flask and shake well.

4. Autoclave the two flasks for 45 min. 

5. Transfer the contents of each agar flask to one media flask and mix well. 

6. Cool to 55 °C and keep in a water bath until ready to pour. 

7. Before pouring, add 100 ml of autoclaved 40% (w/v) glucose and 30 ml 
of 65 mM adenine solution. 

8. Pour 15 cm agar plates. 

9. Dry for 5-7 days at room temperature and store at room temperature.
If the plates need to be used earlier, they can be dried for 30 min in a 
sterile hood with the ventilation on. 



3.6.2. Selective yeast media 

Selective media are used for maintaining the AD-Y and DB-X plasmids and 
detection of reporter activity. Prototrophic markers are used for selection 
on plates lacking the appropriate amino acid or nucleotide. In our system 
the DB-expressing plasmid contains the selectable marker LEU2 which 
enables growth of the Y8800/Y8930 yeast strains on plates lacking leucine 
(-Leu), while the AD-expressing plasmid contains the TRP1 marker which 
enables growth on plates lacking tryptophan (-Trp). The other two proto- 
trophic markers (HIS3 and ADE2) are used as reporter genes in our 
experiments. Expression of these markers is selected on plates lacking
histidine (-His) (supplemented with 1 mM 3-amino-1,2,4-triazole, 3AT)
or lacking adenine (-Ade). Autoactivator detection plates are supplemented
with 1 mg/l of CHX and contain tryptophan to allow growth of yeast cells
without the AD-Y plasmid.

Synthetic complete (Sc) media. The different selective media are based on
the same Sc drop-out media recipe, but then supplemented with different 
amino acids to prepare the media appropriate for the various applications. 

• Sc media 

1. Mix 5.2 g of amino acid powder lacking leucine, tryptophan, histidine, 
and adenine, 6.8 g of yeast nitrogen base (without ammonium sulfate 
and amino acids), and 20 g of ammonium sulfate. 

2. Dissolve in 1900 ml water and add a stir bar. 

3. Adjust the pH to 5.9 by adding a few drops of 10 M NaOH. 

4. Autoclave the flasks for 45 min. 

5. Add 8 ml of each stock solution as needed. Store at room temperature. 

• Sc agar plates 

For a 4-l preparation of 15 cm agar Petri plates containing Sc medium
lacking particular amino acids or nucleotides:

1. Place a magnetic stir bar into two 2-l flasks and label as the "media flasks."

2. Mix 5.2 g of amino acid powder lacking leucine, tryptophan, histidine, 
and adenine, 6.8 g of yeast nitrogen base (without ammonium sulfate 
and amino acids), and 20 g of ammonium sulfate. 

3. Dissolve in 1900 ml water and add a stir bar. 

4. Adjust the pH to 5.9 by adding a few drops of 10 M NaOH. 

5. Add 40 g of agar and 900 ml of water to two 2-l flasks.

6. Autoclave the four flasks for 45 min. 

7. Transfer the contents of each agar flask to one media flask and mix well. 

8. Cool to 55 °C and keep in a water bath until ready to pour. 

9. Add 100 ml of autoclaved 40% glucose (w/v). 

10. Add the required concentrated stock solutions, and 3AT or CHX as
needed. 

11. Pour approximately 100 ml in 15-cm sterile Petri plates. 

12. Dry for 5-7 days at room temperature, then store indefinitely at 4 °C. If
the plates need to be used earlier, they can be dried for 30 min in a 
sterile hood with ventilation on. 

Amino acid powder mix and stock solutions. All amino acids that are
never used as prototrophic markers are combined in an amino acid mix
that is added to all Sc plates.

To prepare the amino acid powders: 

1. Mix 6 g of each of the following amino acids: alanine, arginine, aspartic acid,
asparagine, cysteine, glutamic acid, glutamine, glycine, isoleucine, lysine, 
methionine, phenylalanine, proline, serine, threonine, tyrosine, and valine. 

2. For the amino acid powder containing adenine, add 6 g of adenine sulfate. 

Tryptophan, histidine, leucine, uracil, and adenine are omitted so they 
can be added to batches of plates as needed. The concentrated stock 
solutions are used at 8 ml/l of media, except for adenine which is used at
15 ml/l of media. The different stock solutions are prepared at the following
concentrations: 100 mM histidine-HCl (store light protected), 100 mM
leucine, 65 mM adenine sulfate, and 40 mM tryptophan. These stock
solutions are stored at room temperature, except for tryptophan, which 
should be stored in the dark at 4 °C. 
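
As a quick check on these dosing rules, the final concentration of each supplement can be computed from the stock concentration and the volume added. The sketch below is illustrative only: the function name is hypothetical, and it assumes the stock is added to 1 l of medium and that volumes are additive (other additions such as glucose are ignored).

```python
# Stock concentrations (mM) and volumes added per liter of medium (ml),
# taken from the text above.
STOCKS = {
    "histidine": (100.0, 8.0),
    "leucine": (100.0, 8.0),
    "tryptophan": (40.0, 8.0),
    "adenine": (65.0, 15.0),
}


def final_concentration_mM(stock_mM: float, added_ml: float,
                           base_volume_ml: float = 1000.0) -> float:
    """Concentration after adding `added_ml` of stock to `base_volume_ml`
    of medium, assuming additive volumes (an assumption of this sketch)."""
    return stock_mM * added_ml / (base_volume_ml + added_ml)


if __name__ == "__main__":
    for name, (stock_mM, added_ml) in STOCKS.items():
        print(f"{name}: {final_concentration_mM(stock_mM, added_ml):.2f} mM")
```

All four supplements end up just below 1 mM under these assumptions, with tryptophan lowest at roughly 0.3 mM.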




4. Validation Using Orthogonal Binary 
Interaction Assays 

Complementary assays are essential to assess the precision of a dataset
against PRS and RRS (see 2.2.1). The following complementary assays can be
used to determine the precision of a dataset by testing a random sample, and as
part of an interaction assay tool-kit for confidence scoring of individual
interactions (Braun et al., 2009). We describe the yellow fluorescent protein
(YFP)-based protein complementation assay (Nyfeler et al., 2005) and the sandwich
ELISA-like well-NAPPA protein interaction assay (Braun et al., 2009). All
expression constructs for these methods can be assembled using Gateway
recombinational cloning or other high-throughput cloning methods.



Protocol 14: Yellow fluorescent protein complementation assay 
(YFP-PCA) 

In YFP-PCA, two nonfluorescent fragments of YFP (F1 and F2) are genetically
attached to ORFs coding for the two proteins that are to be tested in this assay. If
the two proteins interact, functional YFP can be reconstituted and detected by
fluorescence-activated cell sorting (FACS). In this protocol a cyan fluorescent
protein (CFP) coding plasmid is cotransfected as a transfection control.
[Day 0: Seed cells, measure DNA] 

1. In a 96-well tissue culture plate, seed CHO-K1 cells at 6 X 10 cells/well in
100 µl Ham's F12 media + 10% fetal calf serum. After 24 h, confluence
should reach 70%. 

2. Determine the concentration of the expression plasmids with PicoGreen 
assay (Invitrogen) or related assay. 

[Day 1: Transfection] 

1. Replace growth media on cells with 100 µl Opti-MEM media (Invitrogen)
equilibrated to 37 °C.

2. Combine 30 ng of each PCA construct with 140 ng CFP plasmid for a
total of 200 ng DNA in 25 µl Opti-MEM media per well to obtain the
DNA mix.

3. Combine 0.5 µl Lipofectamine 2000 reagent (Invitrogen) with 25 µl
Opti-MEM media per well to obtain the transfection reagent mix.

4. Incubate 5-25 min at room temperature.

5. Combine the DNA and transfection reagent mixes to yield 50 µl
transfection mix. 

6. Incubate for at least 20 min (not longer than 6 h). 

7. Add transfection mix to the cells. 

8. Incubate for 18 h. 
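
Because the plasmid concentrations measured on Day 0 differ between constructs, the pipetting volumes for the DNA mix have to be computed per well. The sketch below shows one way to do this; the function name and the returned keys are hypothetical, and it assumes that Opti-MEM is used to bring the DNA mix to a total of 25 µl, which is our reading of step 2 rather than something the protocol states explicitly.

```python
def dna_mix_volumes(conc_f1: float, conc_f2: float, conc_cfp: float,
                    mix_volume_ul: float = 25.0,
                    pca_ng: float = 30.0, cfp_ng: float = 140.0) -> dict:
    """Volumes (in µl) for one well's DNA mix.

    Concentrations are in ng/µl (e.g., from the PicoGreen measurement).
    Assumption of this sketch: Opti-MEM tops the mix up to `mix_volume_ul`.
    """
    for conc in (conc_f1, conc_f2, conc_cfp):
        if conc <= 0:
            raise ValueError("plasmid concentrations must be positive")
    v_f1 = pca_ng / conc_f1    # 30 ng of the first PCA construct
    v_f2 = pca_ng / conc_f2    # 30 ng of the second PCA construct
    v_cfp = cfp_ng / conc_cfp  # 140 ng of the CFP control plasmid
    medium = mix_volume_ul - (v_f1 + v_f2 + v_cfp)
    if medium < 0:
        raise ValueError("plasmids too dilute to fit into the DNA mix volume")
    return {"f1_ul": v_f1, "f2_ul": v_f2, "cfp_ul": v_cfp,
            "optimem_ul": medium, "total_dna_ng": 2 * pca_ng + cfp_ng}
```

For example, plasmids at 30, 60, and 70 ng/µl would require 1.0, 0.5, and 2.0 µl, respectively, plus 21.5 µl of Opti-MEM.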

[Day 3: FACS Analysis] 

1. Wash cells three times gently with phosphate-buffered saline (PBS). 

2. Add 20 µl trypsin.

3. Incubate ~10 min at room temperature until cells are detached.

4. Resuspend in 100 µl PBS.

5. Analyze cells by FACS. 

Count a minimum of 10,000 events. Gate for CFP-positive cells and
analyze YFP fluorescence only in this subpopulation. Discard any result that
is supported by fewer than 200 cells or for which the CFP transfection rate is
unacceptably low (<5%). On every FACS instrument the voltages and gates
need to be calibrated using full-length YFP and CFP constructs as controls.
The best criteria for scoring positive interactions should be identified using
a large enough set of controls (at least one plate worth of each PRS and RRS).
After such a calibration, we score a pair positive if at least 30% of
CFP-positive cells are YFP positive, if the average YFP signal is above
background, and if the YFP/CFP ratio is at least twice as high as the ratio of
the average YFP signal over the average CFP signal on that plate. Scoring
parameters must be recalibrated on PRS/RRS data for each implementation.
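
The scoring criteria can be collected into a single function so that the thresholds are explicit and easy to recalibrate. This is a minimal sketch under stated assumptions: the function and parameter names are hypothetical, the 200-cell minimum is applied to the CFP-positive subpopulation (the text does not say which population it refers to), and the default thresholds are the values quoted above, which must be re-derived from PRS/RRS data for each implementation.

```python
from typing import Optional


def score_pca_well(n_events: int, n_cfp_pos: int, n_yfp_pos_in_cfp: int,
                   mean_yfp: float, mean_cfp: float, yfp_background: float,
                   plate_mean_yfp: float, plate_mean_cfp: float,
                   min_cells: int = 200, min_transfection: float = 0.05,
                   min_yfp_fraction: float = 0.30,
                   min_ratio_fold: float = 2.0) -> Optional[bool]:
    """Return True/False for an interaction call, or None if the well is discarded.

    Signals are mean fluorescence values of the CFP-positive gate; the plate
    means are averages over all wells of the same plate.
    """
    # Quality filter: enough cells and an acceptable CFP transfection rate.
    if n_events <= 0 or n_cfp_pos < min_cells:
        return None
    if n_cfp_pos / n_events < min_transfection:
        return None
    yfp_fraction = n_yfp_pos_in_cfp / n_cfp_pos
    well_ratio = mean_yfp / mean_cfp
    plate_ratio = plate_mean_yfp / plate_mean_cfp
    return (yfp_fraction >= min_yfp_fraction
            and mean_yfp > yfp_background
            and well_ratio >= min_ratio_fold * plate_ratio)
```

Returning `None` for discarded wells keeps failed measurements distinct from negative calls, which matters when precision is later estimated against PRS and RRS.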



Protocol 15: Well-nucleic acid programmable protein array 
(wNAPPA) 

In well-NAPPA, the two proteins are genetically fused to a
glutathione-S-transferase (GST) tag and to an HA epitope tag, respectively,
and expressed in a coupled transcription/translation reticulocyte lysate. The
GST-tagged protein (GST-X) is captured using an anti-GST antibody that is
immobilized at the bottom of a 96-well microtiter plate. If the two proteins
are interacting, this interaction can be detected with an anti-HA antibody.
Like all assays, this biochemical pull-down assay from in vitro coupled
transcription-translation needs to be calibrated against PRS and RRS datasets
to evaluate performance.

[Day 0: Blocking] 

1. Add 200 µl/well blocking buffer (5% (w/v) fat-free dry milk powder
dissolved in PBS prepared according to standard protocols) to a microtiter
plate coated with rabbit anti-GST antibody (GST 96-well
Detection Module, GE Healthcare).

2. Block at 4 °C for 14-24 h. 

[Day 1: wNAPPA assay]

1. Determine the DNA concentration of expression plasmids using 
PicoGreen or a similar assay. 

2. Add 0.5-1 µg of each of the two plasmids to complete reticulocyte
lysate reaction mix (25 µl) (TnT Coupled Transcription/Translation
System, Promega). 

3. Incubate for 1.5 h at 30 °C on a shaker. 

4. Dilute the reaction mix with 100 µl/well blocking solution.

5. Transfer the diluted reaction mix to the prepared anti-GST coated 
plate. 

6. Incubate at 15 °C on a shaker for 2 h.

7. Discard reaction mix and wash three times with 200 µl blocking buffer
for 5 min.

8. Add 150 µl anti-HA monoclonal antibody (Cell Signaling Technologies)
1:5000 in blocking buffer.

9. Wash three times with 150 µl blocking buffer for 5 min each.

10. Add horseradish peroxidase (HRP) coupled goat anti-mouse antibody
(Amersham) 1:1000-1:2000 in blocking buffer.

11. Wash three times with 150 µl PBS for 5 min.

12. Develop with 100 µl enhanced chemiluminescence (ECL) reagent
such as Pierce PicoWest ECL reagent. Alternatively, a colorimetric HRP
substrate will give similar results.

13. Chemiluminescence is measured with a Biorad molecular imager gel 
doc system, but measurement could also be done with a 96-well plate 
spectrophotometer reader. 




5. Conclusion 

Information on interactome networks constitutes a critical element of 
systems biology. We have spelled out a general approach to high-quality 
interactome mapping in which a reliable high-throughput assay is used as a 
primary screening platform. Subsequently, alternative validation assays are 
used to demonstrate data quality in a way unprejudiced by preconceived 
ideas and biases about what protein interactions are supposed to look like. 
To produce high-quality data, appropriate controls need to be implemented 
at every stage of a binary interactome mapping pipeline, including thorough 
controls for technical artifacts and subsequent experimental determination 
of the quality of interactome network maps. Experimental validation of 
primary screening data ensures data quality unbiased by current scientific 
perceptions and hence of greatest utility for exploring biology. 

Use of this general framework of interactome mapping, the main fea- 
tures of which are stringent removal of technical artifacts and experimental 
control of data quality, will enable production of high-quality datasets. 

Abstract

Systems biology at the molecular level is concerned with networks of interact- 
ing molecules, their structure, and dynamic response to perturbations that give 
rise to systems' properties that determine measurable, macroscopic pheno- 
types. At any time, in any cell, multiple types of molecular networks are 
concurrently active. 

One of the most important known regulatory systems in eukaryotic cells is 
reversible protein phosphorylation catalyzed by protein kinases and phospha- 
tases, respectively. Therefore, it is essential to understand and eventually 
model the protein phosphorylation-mediated informational fluxes in cells 
from sensors and signaling systems to effector molecules, to comprehensively 
analyze the dynamic system of kinases/phosphatases and their substrates and 
to determine the basic rules of information processing in cells. In this chapter, 
we describe the protocols necessary to comprehensively and quantitatively 
measure the phosphorylation-modulated informational networks in cells. 

* Institute of Molecular Systems Biology, ETH Zurich, Zurich, Switzerland 
Institute for Systems Biology, Seattle, Washington, USA 

* Faculty of Science, University of Zurich, Zurich, Switzerland 

Current address: Department of Microbiology and Immunology, Stanford University Stanford, California, USA 

Methods in Enzymology, Volume 470 © 2010 Elsevier Inc.

ISSN 0076-6879, DOI: 10.1016/S0076-6879(10)70013-6 All rights reserved. 




Bernd Bodenmiller and Ruedi Aebersold

The pipeline relies on the selective, quantitative isolation of phosphopeptides 
generated by the tryptic digestion of complex protein mixtures and their 
subsequent mass spectrometric and computational analysis. 

We believe that the protocols and data processing tools described in this 
chapter will be a valuable resource for biologists interested in the analysis of 
protein phosphorylation-based signal transduction. 




1. Introduction 

Reversible phosphorylation of proteins, carried out by kinases and 
phosphatases, constitutes one of the most important regulatory mechanisms 
in eukaryotic cells. Kinases, phosphatases, and their substrates form a net- 
work that controls and processes the flow of information, from sensors via 
signaling relays to effector molecules, thereby regulating processes like 
cellular growth, cell division, or apoptosis (Hunter, 2000). 

In yeast, over 150 kinases and phosphatases are currently known, but our
knowledge of their cellular roles, and especially their substrates, is still very
sparse (Hunter and Plowman, 1997; Ptacek et al., 2005). This is illustrated
by the fact that the upstream kinases or phosphatases are known in vivo for
only several hundred of the ~2000 phosphoproteins, which together carry
over 10,000 phosphorylation sites (Ptacek et al., 2005).

To thoroughly comprehend cellular processes and their adaptation to 
stimuli, it is of crucial importance to investigate the connections between 
kinases, phosphatases, and their in vivo responders (proteins that change 
their phosphorylation status based on the activity of a given kinase or 
phosphatase). The elucidation of these connections constitutes one of the 
major biological questions which, if addressed, would open new avenues for 
biological and medical research (Hunter, 2000; Tan et al., 2009).

Over the last decade, several phosphoproteomic techniques that allow
these connections to be elucidated have emerged. They are based on the
reproducible and highly specific isolation of phosphopeptides from the tryptic
digests of the proteome, their analysis using liquid chromatography-tandem
mass spectrometry (LC-MS/MS), and the subsequent analysis of such data
using computational tools (Aebersold and Goodlett, 2001; Aebersold and
Mann, 2003; Gruhler et al., 2005; Olsen et al., 2006; Tao et al., 2005; Zhou
et al., 2001).

However, such analyses are still far from being routine. First, signaling 
proteins are often of low abundance, and to complicate things only a 
fraction of a given protein may be phosphorylated at a given time, making 
the detection of the corresponding phosphorylation sites very challenging 
(Salih, 2004). 

Second, it is often difficult to preserve phosphorylated amino acid
residues as phosphorylations on amino groups and acyl-phosphorylations 
are extremely acid labile, while phosphorylations on hydroxyl groups are 
base sensitive. Therefore, special precautions need to be taken when pro- 
cessing the samples (Salih, 2004). 

Third, changes in the state of the phosphoproteome are highly dynamic;
therefore, it is essential to preserve the cellular phosphorylation state of
interest (Gruhler et al., 2005; Olsen et al., 2006). We found, for example,
that the widely applied washing of yeast cells with ice-cold, low molarity
buffers alone triggers both the starvation and the osmotic shock responses,
while centrifuging yeast triggers their stress response, all changing the
phosphoproteome. Fourth and finally, challenges associated with the
MS-based analysis of protein phosphorylation still exist, among them the
impaired fragmentation of phosphopeptides under collision-induced
dissociation (CID) in ion trap instruments and their nonstandard
chromatographic behavior (Macek et al., 2009).

We have developed an MS approach based on label-free quantification
that allows a significant fraction of the yeast phosphoproteome (>10,000
phosphorylation sites on >2000 phosphoproteins) to be monitored
comprehensively, quantitatively, and reproducibly. We have used this approach
to successfully detect changes in the phosphoproteome upon the inhibition
of the Tor kinase using a specific inhibitor, rapamycin (Huber et al., 2009),
and to determine the first phosphorylation network of yeast, connecting
most kinases and phosphatases with their responders in vivo (Bodenmiller
et al., submitted).

The approach presented here (for an overview see Fig. 13.1), even 
though optimized for yeast, is with minor adaptations generally applicable 
to detect and quantify changes in the phosphoproteome of any organism, 
with high throughput and low effort, upon any stimulus of interest. 

In the following, we use the comparison between yeast wild-type cells 
and yeast kinase gene deletion mutant cells to illustrate our workflow. 




2. Protocols 

2.1. Generation of peptide samples 

For any phosphoproteomic approach it is essential to preserve the in vivo
phosphorylation state of the cells. For that purpose we established a method
in which trichloroacetic acid (TCA) is added directly into the growth
medium to a final concentration of 6% before the yeast harvest (Urban et al.,
2007). TCA rapidly enters the cells and, first, lowers the pH to ~1.5 and
thereby protonates all proteins (including kinases and phosphatases) and,
second, denatures them, thus quenching all enzymatic activity and preserving
the in vivo cellular phosphorylation state (Urban et al., 2007).

Figure 13.1 An experimental workflow to quantify changes in the phosphoproteome.
It consists of the following steps: (i) triplicate cultures of the reference and cell state of
interest are grown; (ii) phosphopeptides are highly selectively isolated from the
corresponding proteome digests; (iii) for each phosphopeptide isolate, LC-MS(/MS)
maps (also called phosphorylation patterns) are generated; (iv) the phosphoproteome
patterns are compared and correlated using the algorithms SuperHirn (Mueller et al.,
2007) and Corra (Brusniak et al., 2008). This analysis yields phosphopeptide features
which are significantly regulated upon the stimulus of interest; (v) if significantly
regulated phosphopeptide ions could not be annotated with an amino acid sequence
in (iii), they are reanalyzed using targeted LC-MS/MS; (vi) a list of phosphopeptides
displaying a significant abundance change between the reference cells and cells of
interest is generated for further analyses.

To perform the subsequent phosphopeptide isolation, ~3 mg of cellular
protein lysate are needed. Such an amount can easily be isolated from ~50 ml
of a yeast culture grown to OD600 0.8-1.0. If system-wide phosphopeptide
quantification is performed, it is strongly recommended to grow
biological triplicates, which allows solid statistical significance to be
computed (Brusniak et al., 2008). Finally, we propose to use synthetic defined
(SD) medium to culture yeast as, first, the exact composition of the medium
can be controlled and, second, protein contamination deriving from YPD
medium can be avoided.

2.1.1. Medium and growth conditions

Saccharomyces cerevisiae strains are streaked out on appropriate plates and three 
replicates (from three single colonies), both of the wild-type and the gene 
deletion strain, are grown in a 1-ml preculture overnight at 30 °C in SD
medium (per liter: 1.7 g yeast nitrogen base without amino acids (Chemie 
Brunschwig, Basel, Switzerland), 5 g ammonium sulfate, 2% glucose (w/v), 
0.03 g isoleucine, 0.15 g valine, 0.04 g adenine, 0.02 g arginine, 0.02 g 
histidine, 0.1 g leucine, 0.03 g lysine, 0.02 g methionine, 0.05 g phenylala- 
nine, 0.2 g threonine, 0.04 g tryptophan, and 0.03 g tyrosine). The pre- 
culture is then used to inoculate 50 ml of SD medium in a 500-ml shaking 
flask to an OD600 of 0.05.
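
The inoculum volume needed to reach the starting OD600 follows from a simple dilution calculation. The sketch below is illustrative rather than part of the protocol: the function name is hypothetical, and it assumes that OD600 scales linearly with cell density (so a dense preculture should be measured after dilution) and that the 50 ml refers to the final culture volume.

```python
def inoculum_volume_ml(preculture_od600: float, target_od600: float = 0.05,
                       culture_ml: float = 50.0,
                       preculture_ml: float = 1.0) -> float:
    """Volume of preculture needed so that `culture_ml` starts at `target_od600`.

    Uses C1 * V1 = C2 * V2 with OD600 as a proxy for cell concentration
    (an assumption that only holds in the linear range of the photometer).
    """
    if preculture_od600 <= target_od600:
        raise ValueError("preculture is not denser than the target culture")
    volume = target_od600 * culture_ml / preculture_od600
    if volume > preculture_ml:
        raise ValueError("not enough preculture for this inoculation")
    return volume
```

With a preculture at OD600 5.0, for instance, 0.5 ml would be required, so a 1-ml preculture is sufficient only if it has grown to an OD600 of at least 2.5.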

2.1.2. Harvest of the yeast cells 

At an OD600 of 0.8, 100% (w/v) TCA is added to a final concentration of 6%
directly into the yeast cultures (Urban et al., 2007). The flasks containing the
cultures are put into ice water for 10 min (all subsequent steps are performed
at 4 °C). The cultures are then transferred into 50-ml tubes and the yeast cells
are pelleted by centrifugation at 1500×g. The supernatant is discarded and the
cells are resuspended in 10 ml ice-cold acetone (at this step the yeast form
aggregates resembling snowflakes). Then the cells are again pelleted at
1500×g and the supernatant is discarded. This step is repeated once and
then the yeast pellet is transferred to a 2-ml safe-lock (important!) Eppendorf
tube and can be stored at -80 °C until further processing.
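
The volume of 100% TCA stock required to reach 6% depends on whether the added volume is counted toward the final volume. The following sketch assumes that it is (additive volumes); the function name is hypothetical and the calculation is offered as a convenience, not as the authors' stated procedure.

```python
def tca_stock_volume_ml(culture_ml: float, final_pct: float = 6.0,
                        stock_pct: float = 100.0) -> float:
    """Volume of TCA stock to add to `culture_ml` of culture.

    Solves stock_pct * v = final_pct * (culture_ml + v) for v,
    i.e., the added stock is counted toward the final volume.
    """
    if not 0 < final_pct < stock_pct:
        raise ValueError("final concentration must lie between 0 and the stock")
    return final_pct * culture_ml / (stock_pct - final_pct)
```

For a 50-ml culture this gives about 3.2 ml of stock, slightly more than the 3.0 ml obtained if the added volume is ignored.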

2.1.3. Lysis of yeast cells and peptide generation 

After removing residual acetone by pipetting, 800 µl of a solution consisting
of 8 M urea, 50 mM ammonium bicarbonate, and 5 mM EDTA is added to
each yeast pellet (3 replicates × 2 strains). In addition, acid-washed glass
beads (500 µm diameter) are added to a final volume of 1.5 ml. Then cells
are lysed by bead beating (five times beating for 1 min, with 1 min breaks on
ice between consecutive beatings). After that step, the cell debris is pelleted
by centrifugation at 16,000×g at room temperature and the supernatant is
collected. Again 800 µl of the urea solution is added and the bead beating
procedure described above is repeated. Then the protein concentration of
the pooled supernatants of each replicate is determined using a BCA protein
assay kit (Pierce, Rockford, IL, USA). For each replicate, 3 mg of protein is
reduced for 30 min by 5 mM tris(2-carboxyethyl)phosphine (TCEP) and
alkylated for 1 h using 10 mM iodoacetamide. After diluting the urea
solution with 50 mM ammonium bicarbonate to <1.5 M, trypsin is added
in a ratio of 1:125 (w/w) and the digestion is performed for 8 h at 37 °C
(each step (reduction, alkylation, and digestion) is very insensitive to the
total volume and has been successfully performed in volumes from several
microliters up to 100 ml).
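
The dilution and trypsin amounts for the digestion follow directly from the measured protein concentration. The sketch below is a hedged illustration: the function name and the example concentration are hypothetical, the small volumes of the TCEP and iodoacetamide additions are ignored, and the buffer volume returned is a minimum because the text asks for a urea concentration below 1.5 M.

```python
def digestion_plan(protein_conc_mg_per_ml: float, protein_mg: float = 3.0,
                   urea_M: float = 8.0, max_urea_M: float = 1.5,
                   trypsin_ratio: float = 125.0) -> dict:
    """Sample volume, minimum dilution buffer, and trypsin mass for one replicate.

    protein_conc_mg_per_ml is the BCA result for the pooled supernatants.
    """
    if protein_conc_mg_per_ml <= 0:
        raise ValueError("protein concentration must be positive")
    sample_ml = protein_mg / protein_conc_mg_per_ml
    # Final volume at which the urea concentration equals max_urea_M.
    min_final_ml = sample_ml * urea_M / max_urea_M
    return {
        "sample_ml": sample_ml,
        "min_buffer_ml": min_final_ml - sample_ml,  # 50 mM ammonium bicarbonate
        "trypsin_ug": protein_mg * 1000.0 / trypsin_ratio,  # 1:125 (w/w)
    }
```

At a hypothetical lysate concentration of 2 mg/ml, 1.5 ml of lysate would be diluted with at least 6.5 ml of buffer and digested with 24 µg of trypsin.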

2.1.4. Purification of peptides

Prior to the phosphopeptide isolation, peptides have to be cleaned using 
reverse phase chromatography. First, the pH of the digestion mixture is 
lowered to pH 2.5 using 10-100% trifluoroacetic acid (TFA). Of note: TFA
(and also TCA) solutions >1% must always be stored in glass containers and
never in plastic, as otherwise high amounts of polymers are generated which
impede the mass spectrometry measurements. For the same reason, TFA/
TCA solutions >1% must always be added with a glass syringe. Second, an
appropriate reverse phase column (e.g., C18 Sep-Pak with 500 mg resin,
Waters, Milford, MA, USA) is wetted with acetonitrile and equilibrated
using 0.1% TFA, 2% acetonitrile. After the peptides have been bound to the
resin in the columns, they are washed with 0.1% TFA, 2% acetonitrile, and
the peptides are eluted using 0.1% TFA, 40% acetonitrile. Finally, the eluate
is dried in a vacuum concentrator to completion.



2.2. Phosphopeptide isolation 

The methods commonly used for the selective isolation of phosphopeptides 
can be grouped into affinity based and chemical methods. In the following, 
three different protocols are described: the first one is based on immobilized 
metal affinity chromatography (IMAC) using Fe(III) metal ions (Andersson 
and Porath, 1986; Bodenmiller et al., 2007a; Ficarro et al., 2002), the
second one, which is also based on affinity chromatography, uses a titanium
dioxide (TiO2) resin (Andersson and Porath, 1986; Bodenmiller et al.,
2007b; Larsen et al., 2005; Pinkse et al., 2004), and the third one uses
phosphoramidate chemistry (PAC) (Bodenmiller et al., 2007c; Tao et al.,
2005; Zhou et al., 2001).

Each of the methods, if correctly used, is highly specific and reproducible
for the isolation of phosphopeptides, and therefore suitable for label-free
quantification. However, the methods isolate distinct yet overlapping parts of
the phosphoproteome. Therefore, it is advantageous to apply all of the methods
to a given sample to increase the covered phosphoproteome (Bodenmiller
et al., 2007b). In addition, the following specific strengths and weaknesses
exist. First, for both affinity-based methods, underloading of the resin
will result in low specificity of phosphopeptide isolation. We found that
PAC is more tolerant of underloading (Bodenmiller et al., 2007b). Second,
both TiO2 and PAC have a strong bias toward singly phosphorylated
phosphopeptides, while in the case of IMAC the amount of peptide loaded
per resin will influence whether singly or multiply phosphorylated peptides are
isolated (Bodenmiller et al., 2007b; Thingholm et al., 2008, 2009). Third, of
the three methods PAC is the most reproducible while IMAC is the least.
Fourth and finally, in terms of practicability, TiO2 is fast, straightforward,
and tolerant to impurities and experimental errors. Therefore, we
propose to start with that method. However, IMAC and PAC are also easy
to master and only need half a day more to perform.

For all methods the correct pH at different steps in the protocol is crucial 
for the success and reproducibility of the isolations. In addition, for the 
affinity chromatography-based methods the optimum peptide to resin ratio 
is essential for their specificity (Bodenmiller et al., 2007b).

2.2.1. Phosphopeptide isolation using a TiO2 resin

Three milligrams of dried peptides are reconstituted in 280 µl of a washing
solution (WS) consisting of 80% acetonitrile, 3.5% TFA which is saturated
with phthalic acid (~100 mg phthalic acid/ml; precipitate must be visible at
the bottom of the vial after vigorous shaking). Then 1.25 mg TiO2 (GL
Science, Saitama, Japan) resin is placed into a 1-ml Mobicol spin column
(MoBiTec, Göttingen, Germany) and is subsequently washed with 280 µl
water, 280 µl methanol, and finally is equilibrated with 280 µl WS for at
least 10 min (the liquid is always removed by centrifugation at 500×g).

After removal of the WS, the peptide solution is added to the equilibrated
TiO2 in the blocked Mobicol spin column and is incubated for
>30 min with end-over-end rotation. After the incubation step the peptide
solution is removed by centrifugation, and the resin is thoroughly washed
two times each with 280 µl of the WS, with an 80% acetonitrile, 0.1% TFA
solution, and finally with 0.1% TFA. In the final step, phosphopeptides are
eluted from the TiO2 resin using two times 150 µl of a 0.3 M NH4OH
solution (pH ~10.5). After elution, the pH of the pooled eluates is rapidly
adjusted to 2.7 using 10% TFA, and phosphopeptides are purified using an
appropriate reverse phase column suitable for up to 20 µg peptide (e.g., C18
Micro Spin Column, Nest Group or Harvard Apparatus, MA, USA; see
above for the detailed protocol).

2.2.2. Phosphopeptide isolation using IMAC 

Three milligrams of peptides are reconstituted in 280 µl of a WS consisting
of 250 mM acetic acid with 30% acetonitrile at pH 2.7. Then 60 µl of
uniformly suspended PHOS-Select iron affinity gel (Sigma Aldrich),
corresponding to ~30 µl resin, is placed into a 1-ml Mobicol spin column
(MoBiTec) using a cut pipette tip. The resin is washed and thereby
equilibrated by using three times 280 µl of the WS (the liquid is always removed
by centrifugation at 500×g). After removal of the WS the peptide solution is
added to the equilibrated IMAC resin in the blocked Mobicol spin column.
To obtain reproducible results it is crucial that the pH in all replicate samples
is maintained at ~2.5. The affinity gel is then incubated with the peptide
solution for 120 min with end-over-end rotation. After the incubation,
the liquid is removed by centrifugation and the resin is thoroughly washed
two times with 280 µl of the WS, and once with ultrapure water. In the final
step, phosphopeptides are eluted once using 150 µl of a 50 mM phosphate
buffer (pH 8.9) and once using 150 µl of a 100 mM phosphate buffer (pH
8.9), each time incubating the resin <3 min with the elution buffer. Both
eluates are pooled, the pH is rapidly adjusted to 2.7 using 10% TFA, and
phosphopeptides are purified using an appropriate reverse phase column
(C18 Micro Spin Column, Nest Group or Harvard Apparatus; see above for
the detailed protocol).

2.2.3. Phosphopeptide isolation using phosphoramidate chemistry 

In the first reaction step, the peptides are methyl esterified. Here it is
essential to use water-free reagents and to work under dry conditions;
otherwise, unwanted side reactions can take place (He et al., 2004). In detail,
3 mg of dried peptide are reconstituted in 1.5 ml of methanolic HCl, which
is prepared by slowly adding 240 µl of acetyl chloride to 1.5 ml of
anhydrous methanol (caution: strong heat development). The methyl
esterification is then allowed to proceed at 12 °C for 120 min. The solvent
is quickly removed in a cool vacuum concentrator and the peptide methyl
esters are dissolved in 120 µl methanol, 120 µl water, and 240 µl acetonitrile.

Using glass beads as a solid phase (see Fig. 13.2): 750 µl of a solution
containing 50 mM N-(3-dimethylaminopropyl)-N'-ethylcarbodiimide
(EDC), 100 mM imidazole (pH 5.6), 100 mM 2-(N-morpholino)ethanesulfonic
acid (MES) (pH 5.6), and 2 M cystamine are added to the peptide
solution (Bodenmiller et al., 2007b).

Of note: the correct pH is crucial for the success of the reaction. It is,
therefore, strongly recommended to check the pH of the final reaction
solution containing the peptides using a micro-pH electrode and, if
necessary, adjust the pH to 5.5–6.0 (Bodenmiller et al., 2007c).

The reaction is allowed to proceed at room temperature with vigorous
shaking for 8 h. The solution is then loaded onto an appropriate reverse
phase column (C18 Sep-Pak with 500 mg resin, Waters). The derivatized
peptides are, first, washed with 5 ml 0.1% TFA; second, treated for 8 min
with 3 ml 10 mM TCEP (the pH should be adjusted to ~3 using sodium
hydroxide (NaOH) prior to loading) in order to produce free thiol groups;
third, washed again with 5 ml 0.1% TFA to remove residual TCEP; and
fourth, eluted with 1.6 ml 80% acetonitrile, 0.1% TFA. The pH of the eluate
is then adjusted to 6.0 using phosphate buffer pH 8.9.

The acetonitrile/water solution is partially removed in the vacuum
concentrator to a final volume of ~500 µl and the derivatized
phosphopeptides are incubated with 5 mg maleimide-functionalized glass
beads for 1 h at pH 6.2 in a Mobicol column. (The beads are synthesized by
completely dissolving 120 µmol hydroxybenzotriazole, 120 µmol of
3-maleimidopropionic acid, and 120 µmol diisopropylcarbodiimide in 1 ml
of dry dimethylformamide. After 30 min incubation, 100 mg CPG beads
(Proligo Biochemie, Hamburg, Germany) corresponding to 40 µmol free
amino groups are added for 90 min. After the reaction, they are washed
using dimethylformamide and dried using a vacuum concentrator. Beads are
stored dry at 4 °C.)

Figure 13.2 Schematic illustration of the procedure to isolate phosphopeptides using
phosphoramidate chemistry. It consists of the following steps: First, the carboxylate
groups of the peptides are protected using methyl esterification. Second, the phosphate
groups are derivatized with cystamine. Third, the cystamine is reduced using TCEP,
thereby generating free thiol groups. Fourth, the derivatized phosphopeptides are
coupled to a maleimide-derivatized solid phase (e.g., aminopropyl-containing glass
beads). Fifth and finally, nonphosphopeptides are removed by washing the solid phase
and phosphopeptides are released from the solid phase using acidic conditions (from
Bodenmiller et al., 2007c).

Then the derivatized beads are washed two times sequentially with
300 µl 3 M NaCl, water, methanol and, finally, with 80% acetonitrile to
remove nonspecifically bound peptides. In the last step, the beads are
incubated with 300 µl 5% TFA, 30% acetonitrile for 1 h to recover the
phosphopeptides. The recovered sample is dried in the vacuum concentrator
and is reconstituted in an appropriate solvent for the LC-MS(/MS)
analysis.

Using amino-derivatized dendrimer as a solid phase: For the
phosphoramidate reaction, dissolve the peptide methyl esters in 600 µl of
reaction solution (50 mM EDC, 100 mM imidazole, 100 mM MES, and 1 M
PAMAM dendrimer Generation 5 (Dendritech Inc., Midland, MI, USA),
pH 5.5), and incubate at room temperature with strong shaking for
18–24 h (Tao et al., 2005).

To remove nonspecifically bound peptides after the reaction, transfer the
reaction solution into a membrane filter device (molecular mass cut-off of
5000) in which the membrane side is inward or perpendicular. Then wash
the dendrimer three times with 2 M NaCl, 2 M NaCl in 50% methanol, and
three times with 50% methanol (for all steps mix well and make sure that the
dendrimer is dissolved), always using a volume of solvent that allows for
vigorous vortexing. Discard the flow-through to remove nonspecifically
bound nonphosphopeptides, and incubate the dendrimer with 300 µl 5% TFA
for 1 h to break the phosphoramidate bonds and thereby recover the
phosphopeptides.

Spin down and collect the eluate. Add 150–300 µl 50% methanol to the
dendrimer, mix well, and spin down to pool the filtrates. The recovered
sample is dried in the vacuum concentrator and is reconstituted in an
appropriate solvent for the LC-MS(/MS) analysis.

2.3. Mass spectrometric analyses of the 
phosphopeptide isolates 

As the phosphopeptide quantification described in this chapter is based on a 
label-free approach, it is important that the mass spectrometric measure- 
ments are performed on high mass accuracy, high-resolution instruments 
such as the hybrid LTQ-Orbitrap, LTQ-FT, or quadrupole time-of-flight 
(QTOF) and that the chromatographic retention time is highly 
reproducible. 

Ideally, the phosphopeptides are separated using an LC system employing
a nanoflow (200–300 nl/min) acetonitrile/water gradient on a
reverse phase resin. As phosphopeptides are on average more hydrophilic
than their nonphosphorylated counterparts, it is advisable, first, to use C18
beads with 3 µm diameter or less, as this increases the retention and
separation of phosphopeptides, and second, to adapt the acetonitrile/water
gradient normally used for the corresponding nonphosphopeptides,
for example, by starting with 2% and ending with 24% acetonitrile over
60–90 min and by doubling the formic acid concentration to 0.2%.
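
As a rough illustration of the example gradient above, the following Python sketch returns the acetonitrile percentage at a given time point. It assumes a simple linear ramp, which the text does not specify; the function name `acetonitrile_percent` and its parameters are ours, and the defaults are only the example values quoted above (2% to 24% over 60 min).

```python
def acetonitrile_percent(t_min, start_pct=2.0, end_pct=24.0, duration_min=60.0):
    """Return the % acetonitrile at time t_min, assuming a linear ramp.

    The linear shape is an assumption made for illustration only.
    """
    if duration_min <= 0:
        raise ValueError("duration_min must be positive")
    # Hold the end-point composition outside the gradient window.
    t = min(max(t_min, 0.0), duration_min)
    return start_pct + (end_pct - start_pct) * t / duration_min


if __name__ == "__main__":
    for t in (0, 15, 30, 45, 60):
        print(f"{t:>3} min: {acetonitrile_percent(t):.1f}% acetonitrile")
```

For a 90-min run, the same function can be called with `duration_min=90.0`.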

An example LC setup consists of an Eksigent nano LC system (Eksigent
Technologies, Dublin, CA, USA), equipped with an 11-cm fused silica
emitter, 75 µm inner diameter (BGB Analytik, Bockten, Switzerland),
packed with Magic C18 AQ, 3 µm beads, loaded from a cooled (4 °C)
Spark Holland autosampler.

Example MS settings to analyze phosphopeptides on an LTQ-Orbitrap MS
are the following: each MS1 scan is acquired at 60,000 FWHM nominal
resolution settings, with an overall cycle time of ~1.2 s. For injection
control, the automatic gain control (AGC) is set to 5 × 10^[?] ions for a full
Orbitrap MS scan (1 × 10^[?] ions for the linear ion trap MS/MS). The
instrument is calibrated externally, according to the manufacturer's
instructions. To ensure a high mass accuracy, the internal lock mass
calibration at m/z 429.088735 and 445.120025 should be used.
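
To make the notion of mass accuracy concrete, the sketch below computes the deviation of an observed m/z from its theoretical value in parts per million (ppm), using the two lock masses quoted above as reference values. The function names and the default tolerance are illustrative assumptions and do not reflect any instrument software.

```python
# Lock masses (m/z) quoted in the text.
LOCK_MASSES = (429.088735, 445.120025)


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass deviation in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=20.0):
    """True if the observed m/z lies within +/- tol_ppm of the theoretical m/z.

    The 20 ppm default is an illustrative choice, not an instrument setting.
    """
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


if __name__ == "__main__":
    observed = 429.0909  # hypothetical measured lock-mass signal
    print(f"{ppm_error(observed, LOCK_MASSES[0]):.2f} ppm")
```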

Peptides phosphorylated on a serine or threonine residue often exhibit a
loss of phosphoric acid in ion trap-based CID MS2 spectra, severely
impairing the phosphopeptide backbone fragmentation and thereby reducing
the ability of database searching algorithms to unambiguously identify the
phosphopeptide (see Fig. 13.3). Several strategies have been developed to
address this problem. The first is the MS/MS/MS (MS3) experiment
(Beausoleil et al., 2004). Here the dominant neutral loss fragment
ion from the MS2 measurement is resubjected to CID, and the resulting
MS3 spectrum is used to identify the phosphopeptide (see Fig. 13.3). The
second strategy is called multistage activation (MSA or "pseudo MS3")
(Schroeder et al., 2004). Here the neutral loss ion is collisionally activated
while the fragments from the precursor ion are still present in the ion
trap. As a result, the spectrum contains ions from the precursor
fragmentation and the neutral loss product (hence "pseudo MS3"). The main
advantages of this method are that, first, a higher number of fragment ion
spectra can be recorded in the same time compared to the MS3 strategy
and, second, that the subsequent data processing is simpler than for
the MS2/MS3 strategy (see below). Finally, the most recently developed
method to fragment phosphopeptides in ion trap instruments is based on
electron transfer dissociation (ETD) (Syka et al., 2004).
Here radical anions are used to transfer electrons to multiply charged
peptides stored in the ion trap. As a result, the peptide backbone is
fragmented by a nonergodic process, yielding extensive fragmentation of the
phosphopeptide backbone. Particularly important for the analysis of
phosphopeptides is the fact that the phosphate group remains attached to the
phosphopeptide during the ETD fragmentation process, facilitating
unambiguous site assignment.

As the phosphopeptides identified by CID and ETD only partially
overlap, it is also recommended to use both methods, if available, to increase
the number of identified phosphopeptides (Molina et al., 2007).

[Figure 13.3 panels: MS2, MS3, and MSA spectra; x-axis m/z, 200–1400.]

Figure 13.3 Phosphopeptide fragmentation spectra. The MS2, MS3, and MSA spectra
of the phosphopeptide with the parental mass at m/z 767.89 are shown.

Example settings for the different fragmentation methods are as follows.
For all three of them, the 3–6 most intense phosphopeptide ions
identified in the MS1 measurement are typically selected for fragmentation
(importantly, the MS2 measurements should not delay the recording of the
MS1 spectra). For the CID-based methods (MS2, MS3, and MSA), the
isolation width of the parental ion is set to 2 m/z, the normalized collision
energy to 35, the activation Q to 0.25, and the activation time to 30–100 ms.
For the MS3 and MSA experiments, the neutral loss of -98 Da for singly,
-49 Da for doubly, -32.7 Da for triply, and -24.5 Da for quadruply
charged peptides must be defined to trigger the additional fragmentation
step. Typical settings for an ETD experiment performed in an ion trap are
analogous to the CID methods except that many more ions are collected in
the ion trap prior to fragmentation (80,000 vs. 30,000).
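
The neutral loss values listed above follow from dividing the mass of phosphoric acid by the charge state of the precursor. The short sketch below reproduces them; the monoisotopic mass of H3PO4 (97.976896 Da) is a standard value that is not given in the text, and the function names are ours.

```python
H3PO4_MONOISOTOPIC_DA = 97.976896  # monoisotopic mass of phosphoric acid


def neutral_loss_mz(charge, loss_mass=H3PO4_MONOISOTOPIC_DA):
    """m/z shift observed when a precursor of the given charge loses H3PO4."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return loss_mass / charge


def neutral_loss_ion_mz(precursor_mz, charge):
    """Expected m/z of the neutral loss ion for a given precursor m/z."""
    return precursor_mz - neutral_loss_mz(charge)


if __name__ == "__main__":
    for z in (1, 2, 3, 4):
        print(f"{z}+: -{neutral_loss_mz(z):.1f}")
```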
2.4. Data analyses

2.4.1. Database search 

To annotate the phosphopeptide fragmentation spectra with the
corresponding peptide sequences, sequence database searches must be
performed (Eng et al., 1994). Currently, a wide variety of algorithms exist,
all with specific strengths and weaknesses, but it has been shown that the
Sequest (Eng et al., 1994) and Mascot (Perkins et al., 1999) algorithms
perform particularly well in the case of phosphopeptides (Bakalarski et al.,
2007).

After the database search has been completed, it is necessary to estimate
the false positive rate of phosphopeptide identifications. For such analyses,
either the statistical tool PeptideProphet (Keller et al., 2002, 2005) or a
decoy database can be used (Elias and Gygi, 2007).

To perform the database search, the MS data must be converted to the
centroid mzXML or mzML format
(http://tools.proteomecenter.org/software.php). For yeast, the data are
typically searched against a decoy version of the SGD nonredundant
database (www.yeastgenome.org).

General settings for the database search are as follows. For the in silico
digest of the SGD database, trypsin is defined as the protease, cleaving after
K and R (cleavage is not allowed if the residue is followed by P). As
phosphorylation sites close to an R or K impair the tryptic cleavage, at least
two missed cleavages should be allowed in addition to one nontryptic
terminus. The peptide mass is set to 600–6000 Da (if ETD data are searched,
the mass range is extended to 12,000 Da). The precursor ion tolerance is set
to <20 ppm (depending on the MS instrument used), and the fragment ion
tolerance is set to 0.5–0.8 Da. In addition, cysteine is defined with a fixed
carboxyamidomethylation modification (+57.0214 Da), and the phosphate
group must be defined as a variable modification of serine/threonine (MS2,
ETD, MSA: +79.9663 Da; MS3: -18.01528 Da) and tyrosine (MS2, MSA,
MS3, ETD: +79.9663 Da). Finally, y and b fragment ions are defined for all
CID-based data, while c and z fragment ions are defined for ETD.
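
The cleavage rule and the missed-cleavage setting described above can be sketched in a few lines of Python. This is a minimal illustration, not the digestion routine of Sequest or Mascot: the function name `tryptic_peptides` is ours, and the sketch models neither the nontryptic terminus nor the mass range filter.

```python
import re


def tryptic_peptides(sequence, missed_cleavages=2, min_len=1):
    """In silico tryptic digest: cleave after K or R unless followed by P.

    Returns every peptide containing up to `missed_cleavages` internal
    cleavage sites that were left uncut.
    """
    # Positions directly after each cleavable K/R (not followed by P).
    sites = [m.end() for m in re.finditer(r"[KR](?!P)", sequence)]
    bounds = [0] + [s for s in sites if s < len(sequence)] + [len(sequence)]
    peptides = []
    for i in range(len(bounds) - 1):
        # j - i - 1 is the number of missed cleavages in the peptide.
        last = min(i + 2 + missed_cleavages, len(bounds))
        for j in range(i + 1, last):
            peptide = sequence[bounds[i]:bounds[j]]
            if len(peptide) >= min_len:
                peptides.append(peptide)
    return peptides


if __name__ == "__main__":
    # Hypothetical toy sequence; the K before P is not cleaved.
    print(tryptic_peptides("AGKPLSRTEKDV", missed_cleavages=0))
```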

The results obtained from the database search are then subjected to
statistical filtering using PeptideProphet
(http://tools.proteomecenter.org/software.php), and a PeptideProphet
cut-off of 0.9 is typically set in order to classify peptides as correctly or
incorrectly identified. Alternatively, the false discovery rate (FDR) can be
determined using the decoy entries of the decoy database (typically cut-offs
between 1% and 5% are used).
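
A decoy-based FDR estimate can be sketched as follows. The estimator used here (decoy hits divided by target hits above a score threshold) is one common convention and is an assumption on our part; other conventions exist, and the text does not state which one is applied. The function name and the input format are hypothetical.

```python
def decoy_fdr(psms, threshold):
    """Estimate the FDR among hits scoring at or above `threshold`.

    `psms` is a sequence of (score, is_decoy) tuples. The estimate is
    (number of decoy hits) / (number of target hits); this convention is
    an assumption, chosen for illustration.
    """
    psms = list(psms)
    targets = sum(1 for score, is_decoy in psms
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms
                 if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0
    return decoys / targets


if __name__ == "__main__":
    # Hypothetical scores for illustration only.
    hits = [(0.99, False), (0.97, False), (0.95, False),
            (0.93, True), (0.91, False), (0.50, True)]
    print(decoy_fdr(hits, 0.9))
```

In practice, the score threshold would be lowered or raised until the estimated FDR falls within the desired 1–5% range.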

2.4.2. Quantification 

Two points should be kept in mind for the label-free quantification
(Mueller et al., 2007, 2008; Rinner et al., 2007). First, reproducibility is
essential throughout the entire experimental pipeline, both in
the generation of the phosphopeptide isolates and in their analysis by
LC-MS (e.g., retention time and mass accuracy). Second, if subtle
regulatory events are to be identified (e.g., 50% upregulation of a
phosphorylation site), more than three biological replicates should be
processed and analyzed to increase the sensitivity of the statistical analyses.
If higher precision is required, stable isotope labeling-based quantification
methods may be preferable.

A typical data analysis workflow applied in label-free quantification
consists of the following steps (Mueller et al., 2007, 2008; Rinner et al.,
2007). First, the MS data are converted into the profile mzXML format
(http://tools.proteomecenter.org/software.php). Second, phosphopeptide
ion peaks are detected ("feature detection"). Third, the phosphopeptide ion
peak areas are computed based on the LC-MS data. Fourth, the peaks are
aligned over the analyzed LC-MS runs. Fifth, each peak is annotated with
the phosphopeptide sequence and, sixth, the ratio between the
phosphopeptide ions derived from the wild-type and the kinase gene deletion
mutant is computed.
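
The final step of this workflow, the ratio computation, can be illustrated with a short sketch. It assumes that the ratio is expressed as mutant over wild-type on a log2 scale using the mean peak area of the replicates; the direction of the ratio, the use of the mean, and the function name are our assumptions and are not prescribed by the text.

```python
import math
from statistics import mean


def log2_ratio(mutant_areas, wildtype_areas):
    """log2 of (mean mutant peak area / mean wild-type peak area).

    Mutant over wild-type is an assumed convention for this sketch.
    """
    m = mean(mutant_areas)
    w = mean(wildtype_areas)
    if m <= 0 or w <= 0:
        raise ValueError("peak areas must be positive")
    return math.log2(m / w)


if __name__ == "__main__":
    # Hypothetical peak areas from triplicate runs.
    wildtype = [1.0e6, 1.1e6, 0.9e6]
    mutant = [1.5e6, 1.6e6, 1.4e6]
    print(f"log2 ratio: {log2_ratio(mutant, wildtype):.3f}")
```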

A widely used and freely available tool to perform these tasks is
SuperHirn. Typical parameters used for the analysis of LTQ-Orbitrap data,
their description, and the download of the tool can be found at
http://tools.proteomecenter.org/wiki/index.php?title=Software:SuperHirn.

Of note, the peptide identifications must be imported into the SuperHirn
analysis using the output format (.pepXML) of PeptideProphet.

2.4.3. Computation of statistical significances of observed 
regulations 

Based on the SuperHirn output file (called MasterMap), which contains all
detected phosphopeptide ions derived from the triplicate wild-type samples
and the triplicate mutant samples along with the sequence annotation of the
identified phosphopeptides, the significance of an observed abundance
variation is computed. Before the significance analysis, the phosphopeptides
in the MasterMap are separated into different statistical classes:
phosphopeptides for which the MS1 signal was detected in all replicates
(three wild-type and three mutant samples), and phosphopeptides for which
1–5 signals are missing. In the category "3 signals missing," the
phosphopeptides are further separated according to whether (i) the signal is
reproducibly present in either all wild-type or all mutant samples or (ii) the
signal is spread over wild-type and mutant samples.
Before statistical analysis, the missing data values are computed using the
integrated background noise given by the LC-MS analysis and determined
using SuperHirn as a baseline. These datasets are then further analyzed using
the freely available software tool Corra (Brusniak et al., 2008)
(http://tools.proteomecenter.org/Corra/corra.html). It wraps around the
Limma software package (http://www.bioconductor.org/) and performs an
empirical Bayes test as an alternative to the Welch t-test, yielding a p-value
that is further adjusted for multiple comparisons according to the procedure
of Benjamini and Hochberg (1995), which controls the FDR. After this last
analysis, the phosphopeptides which significantly change their abundance
due to a changed activity of a kinase or phosphatase can be identified (see
Fig. 13.4).

Figure 13.4 Example volcano plot as generated by the Corra analysis. Each dot
corresponds to a measured phosphopeptide ion. The y-axis shows the log odds that
the observed regulations are true and the x-axis the change of a phosphopeptide ion on
the log2 scale.
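
For readers unfamiliar with the Benjamini and Hochberg adjustment mentioned above, the following is a minimal stand-alone sketch of the procedure in plain Python. It is not the Corra or Limma implementation; the function name is ours and the p-values in the usage example are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest p-value), and
    monotonicity is enforced by taking a running minimum from the largest
    p-value downward.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical p-values
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"p = {p:.3f} -> adjusted = {q:.3f}")
```

Phosphopeptides whose adjusted p-value falls below the chosen FDR level would then be reported as significantly regulated.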

These data can then be used, for example, to infer in which biological 
processes a given kinase is involved or which signaling pathways are 
affected. 

Overall, we found that data generated in this way are a solid basis for
in-depth follow-up experiments.
Abstract

Protein-fragment complementation assays (PCAs) are a family of assays for 
detecting protein-protein interactions (PPIs) that have been developed to 
provide simple and direct ways to study PPIs in any living cell, multicellular 
organism or in vitro. PCAs can be used to detect PPI between proteins of any 
molecular weight and expressed at their endogenous levels. Proteins are 
expressed in their appropriate cellular compartments and can undergo any 
posttranslational modification or degradation that, barring effects of the PCA 
fragment fusion, they would normally undergo. Applications of PCAs in yeast 
have been limited until recently, simply because appropriate expression plas- 
mids or cassettes had not been developed. However, we have now developed 
and reported on several PCAs in Saccharomyces cerevisiae that cover the gamut 
of applications one could envision for studying any aspect of PPIs. Here, we 
present detailed protocols for large-scale analysis of PPIs with the survival- 
selection dihydrofolate reductase (DHFR) reporter PCA and a new PCA based on 
a yeast cytosine deaminase reporter that allows for both survival and death 
selection. This PCA should prove a powerful way to dissect PPIs. We then 
present a method to study spatial localization and dynamics of PPIs based on 
fluorescent protein reporter PCAs and finally, two luciferase reporter PCAs that 
have proved useful for studies of dynamics of PPIs. 




1. Introduction 

In the protein-fragment complementation assay (PCA) strategy,
protein-protein interactions (PPIs) are measured by fusing each of the
proteins of interest to complementary N- or C-terminal peptides of a
reporter protein that has been rationally dissected using protein engineering
strategies (Michnick et al., 2000; Pelletier and Michnick, 1997; Pelletier
et al., 1998). The reporter protein fragments are brought into proximity by
interaction of the two interacting proteins, allowing them to fold together
into the three-dimensional structure of the reporter protein, thus
reconstituting the activity of the reporter (Fig. 14.1). PCAs have been
created with many different reporter proteins and thus provide for different
types of readouts, depending on the desired application. This generality
means that PCA is not a single reporter assay, but rather a toolkit. PCAs have
been developed to study spatial and temporal changes in PPIs under different
conditions, and as survival-selection assays that provide a simple readout
for large-scale systematic analyses of protein interaction networks or
directed evolution experiments (reviewed in Michnick et al., 2007). Finally,
there are two unique features of PCAs we must note. First, because
interactions between two proteins must occur in such a way that
the reporter protein can fold, PCAs can provide structural and topological
details of how a PPI is formed or whether such complexes undergo
conformational changes under specific conditions (Remy et al., 1999;
Tarassov et al., 2008). Second, contrary to intuition, most PCAs are fully
reversible, allowing for direct studies of the dynamics of both formation and
disruption of PPIs.

Figure 14.1 Conceptual basis of protein-fragment complementation. The
spontaneous unimolecular folding of a protein from its nascent polypeptide (upper) can
be made a protein-protein interaction-dependent bimolecular process by fusing two
interacting proteins to one or the other complementary N- or C-terminal peptides into
which a protein has been dissected (lower). PPI-mediated folding of a reporter protein
from its complementary fragments results in reconstitution of reporter protein activity.




2. General Considerations in Using PCA 



Measuring PPIs in living cells by any method requires that one reconsider
any suppositions about the nature of a PPI, particularly if it has only been
studied with in vitro methods and, most importantly, by indirect methods
such as affinity or immunopurification. PCAs detect direct binary or indirect
proximal interactions between proteins. Thus, if an interaction is assumed on
the basis of experiments that only suggest association of proteins in a
complex, it is possible that no interaction will be detected. Our advice is:
"life is short, experiment." However, we can make some general statements
about what to consider when setting up any PCA experiment in order to
maximize the probability of a successful outcome.

First, we consider the sensitivity of PCAs. As with any analytical technique,
the sensitivity of the assay depends on the sensitivity of the detection
method and on the background signal that may arise from cells. Regardless of
the properties of the reporters, the range of signal detectable will depend in
all cases on the quantity of complexes formed, which in turn is determined by
the abundances of the proteins studied and their affinity for each other. We
have only explored these parameters in great detail for the dihydrofolate
reductase (DHFR) PCA (Section 3). We have demonstrated that for this
simple survival-selection assay, the number of complexes needed to support
survival under the selection conditions was as low as approximately 25 per
cell (Remy and Michnick, 1999) for a complex for which the dissociation
constant was in the range of 1 nM. We recently showed that we could
generalize this result across a proteome, demonstrating that the distribution
of detected interactions covered the range of protein abundances down to
less than 100 molecules per cell (Tarassov et al., 2008). We have also shown
that an upper limit of the dissociation constant for detection of PPIs is
likely in the range of 10–100 µM for the DHFR (Campbell-Valois
et al., 2005) and OyCD (Section 4) PCAs (Ear and Michnick, 2009). These
observations suggest that PPIs can be detected by PCA within ranges of
protein abundances and complex affinities that are commonly observed.
However, a PPI may or may not be detected depending on the PCA reporter
used. For instance, a PPI studied with a fluorescent protein PCA reporter
(Section 5) might not be detected if the abundance of complexes is lower
than necessary to reconstitute enough fluorescent protein. In this case, the
signal will not be high enough to overcome background fluorescence of
cells in the range of wavelengths over which the fluorophore emits. On the
other hand, there are no background issues for luciferase-based PCAs
(Section 6) and thus detection is limited only by the sensitivity of the
detector used. Finally, an issue of particular importance to studies in yeast
where the complementary PCA fragments are fused to gene open reading
frames (ORFs) by homologous recombination is whether the genes are
hetero- or homozygous for the fusions in diploid cells. In heterozygous
cells, the untagged proteins (A and B) will compete for binding with those
that are tagged (A' and B'), resulting in a reduced number of reconstituted
PCA reporter proteins and thus a reduced reporter signal. Only the A'B'
complex (out of the four possible: AB, AB', A'B, and A'B') results in a
reconstituted PCA reporter protein, leading to a fourfold reduction in signal.
The number of reconstituted complexes necessary for signal detection in
assays performed in diploid cells (Tarassov et al., 2008) is, therefore, much
lower than what is expected from the abundances of the interacting partners
alone.
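
The fourfold reduction can be checked with a small enumeration. The sketch below assumes that tagged and untagged alleles are expressed at equal levels and that the tag does not change binding affinity, so that each partner is tagged in half of the complexes; both assumptions and the function names are ours.

```python
from itertools import product


def complex_species():
    """List the four possible A-B complexes in a doubly heterozygous diploid."""
    return [a + b for a, b in product(("A", "A'"), ("B", "B'"))]


def reporter_fraction(tagged_fraction_a=0.5, tagged_fraction_b=0.5):
    """Fraction of complexes in which both partners carry a PCA fragment.

    Assumes independent, equal-affinity binding of tagged and untagged
    proteins (an idealization made for this sketch).
    """
    return tagged_fraction_a * tagged_fraction_b


if __name__ == "__main__":
    print(complex_species())
    print(reporter_fraction())          # heterozygous for both fusions
    print(reporter_fraction(1.0, 1.0))  # homozygous for both fusions
```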
A second set of considerations in using PCA is how the fusion of
complementary PCA reporter fragments could affect the proteins of interest
and the ability to detect PPIs. First, as with any fusion construct, it is critical
to test the fusions in established functional assays in order to ensure that the
tags themselves do not impair the function of the protein or lead to gain of
function. One should also not assume that a functional fusion protein with a
particular tag ensures that other PCA tags will lead to functional fusions.
Different tags may have different effects. Second, we can ask whether the
orientation of fusion (N- or C-terminus) or the identity of the fragment may
affect the outcome of a PCA experiment. This can only be determined
empirically. We have tested all possible combinations and permutations of
tagging individual test proteins that are known to interact (8 total per protein
pair) and found that in some cases it made no difference how the proteins
were tagged, while for others only an individual arrangement worked
(unpublished results).

As we described above, PCAs are sensitive to whether the complementary
N- and C-terminal fragments can find each other in space, and this
depends on the distances between the termini of the interacting proteins to
which the fragments are fused. To ensure that PCA can occur, we typically
insert a 10–15 amino acid flexible polypeptide linker consisting of the
sequence (Gly.Gly.Gly.Gly.Ser)n between the proteins of interest and the
PCA reporter protein fragments. We chose the (Gly.Gly.Gly.Gly.Ser)n
linker because it is the most flexible possible, and we have empirically
observed that linkers of these lengths are sufficiently long to allow the
fragments to find each other and fold, regardless of the sizes of the
interacting proteins to which the fragments are fused (Remy and Michnick,
2001).
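
As a simple illustration, a linker of 10–15 residues corresponds to two or three repeats of the Gly.Gly.Gly.Gly.Ser unit. The sketch below builds the one-letter amino acid sequence for a given number of repeats; the function name is ours, and the choice of codons for the corresponding DNA sequence is deliberately left open.

```python
def flexible_linker(repeats):
    """Return the (Gly4Ser)n linker in one-letter code for n = repeats."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return "GGGGS" * repeats


if __name__ == "__main__":
    for n in (2, 3):
        linker = flexible_linker(n)
        print(f"n = {n}: {linker} ({len(linker)} residues)")
```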




3. DHFR PCA Survival-Selection for Large-Scale 
Analysis of PPIs 

The DHFR PCA was previously developed for Escherichia coli, plant
protoplasts, and mammalian cell lines (Pelletier et al., 1998, 1999; Remy and
Michnick, 1999; Subramaniam et al., 2001) and has recently been adapted
for large-scale screening of PPIs in yeast (Tarassov et al., 2008). The
principle of the DHFR PCA survival-selection assay is that cells lacking
endogenous DHFR activity, here achieved by inhibiting the Saccharomyces
cerevisiae scDHFR with methotrexate, are enabled to proliferate by
simultaneously expressing PCA fragments of a methotrexate-resistant DHFR
mutant that are fused to interacting proteins or peptides. If the proteins
interact and thus allow refolding of the DHFR reporter, cells that are grown
in the presence of methotrexate can proliferate (Fig. 14.2) (Remy and
Michnick, 1999). To adapt the DHFR PCA for high-throughput screening
in S. cerevisiae, we created a double mutant (L22F and F31S) that is 10,000
times less sensitive to methotrexate than wild-type scDHFR, while retaining
full catalytic activity (Ercikan-Abali et al., 1996). The assay can be used
with strains harboring yeast expression vectors in which the target protein
ORF is fused to a PCA fragment coding sequence. It is also sensitive enough
to be used with genomic recombinant strains expressing proteins fused to
PCA fragments under the control of their endogenous promoters. We created
two universal oligonucleotide cassettes encoding each complementary
DHFR PCA fragment and two unique antibiotic resistance enzymes to
allow for selection of haploid strains that have been successfully transformed
and recombined with one or the other homologous recombination cassette
(Tarassov et al., 2008). The resulting universal templates were used to create
homologous recombination cassettes for most budding yeast genes by PCR
using 5' and 3' oligonucleotides consisting of 40-nucleotide sequences
homologous to the 3' end of each ORF (prior to the stop codon) and a
region approximately 20 nucleotides from the stop codon. Below are
protocols to perform DHFR PCA at a large scale with recombinant strains
or with yeast transformed with expression plasmids.

Figure 14.2 (A) DHFR catalyzes the reduction of dihydrofolate to tetrahydrofolate,
which is required for nucleotide and, in some organisms, amino acid synthesis. This
reaction can be inhibited by an antifolate, methotrexate. (B) In the DHFR PCA
strategy, the two proteins of interest are fused to complementary fragments of a mutant
DHFR that is insensitive to methotrexate. The PCA fragments are inactive in the
absence of an interaction. If the proteins interact, the DHFR fragments are brought
together in space and fold into the native structure, thus reconstituting the activity of
the mutant DHFR and allowing cells to proliferate in the presence of methotrexate.
3.1. Materials 

3.1.1. Reagents 

• Glycerol stocks of MATa recombinant strains in which ORFs are fused to 
the complementary DHFR PCA F[1,2] fragment (Open Biosystems). 

• Glycerol stocks of MATα recombinant strains in which ORFs are fused 
to the complementary DHFR PCA F[3] fragment (Open Biosystems). 

• 3% agar solidified YPD medium in Nunc omniplates. 

• 3% agar solidified YPD medium with 100 μg/ml nourseothricin for 
MATa recombinant strains (WERNER BioAgents, Jena, Germany) or 
250 μg/ml hygromycin B for MATα recombinant strains (Wisent 
Corporation, Quebec, Canada) in omniplates. 

• 3% agar solidified YPD medium with both 100 μg/ml nourseothricin 
(WERNER BioAgents) and 250 μg/ml hygromycin B (Wisent 
Corporation) in omniplates. 

• 4% noble agar (purified agar, Bioshop) solidified synthetic complete (SC) 
medium with 200 μg/ml methotrexate (prepared from a 10-mg/ml 
methotrexate in DMSO stock solution) in omniplates. 
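
The dilution implied by the methotrexate plates can be checked with a short calculation. The sketch below is only an illustration: the 35-ml pour volume is borrowed from the omniplates described in the procedure and is assumed here to apply to the methotrexate plates as well, and the function name is hypothetical.

```python
def stock_volume_ml(final_ug_per_ml, stock_mg_per_ml, plate_volume_ml):
    """Volume of stock (ml) needed so that the poured plate reaches the
    requested final concentration (C1 * V1 = C2 * V2)."""
    stock_ug_per_ml = stock_mg_per_ml * 1000.0
    return final_ug_per_ml * plate_volume_ml / stock_ug_per_ml


# Assumed pour volume per omniplate (ml); adjust to the volume actually used.
plate_ml = 35.0

# 200 ug/ml methotrexate from a 10 mg/ml stock in DMSO.
mtx_ml = stock_volume_ml(200.0, 10.0, plate_ml)

# The stock is made in DMSO, so the same volume sets the DMSO content (v/v).
dmso_percent = 100.0 * mtx_ml / plate_ml

print(f"{mtx_ml:.2f} ml stock per plate, {dmso_percent:.1f}% DMSO (v/v)")
```

With these numbers each plate receives 0.7 ml of stock, so the medium contains 2% DMSO by volume.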

3.1.2. Facultative 

Antibodies against DHFR fragments: anti-DHFR polyclonal antibody that 
specifically recognizes an epitope in the N-terminal F[1,2] fragment (Sigma 
D1067, 1:6000; Sigma-Aldrich, St. Louis, MO) and an anti-DHFR polyclonal 
antibody that specifically recognizes an epitope in the C-terminal 
F[3] fragment (Sigma D0942, 1:5000; Sigma-Aldrich). 

3.1.3. Equipment 

• Pintool: Robotically manipulated 96-pintool (0.910 mm flat round-shaped 
pins, AFIX96FP4, V&P Scientific Inc., San Diego, CA), 384-pintool 
(0.356 mm flat round-shaped pins, custom AFIX384FP8 BMP Multimek 
FP8N, V&P Scientific Inc.), and 1536-pintool (0.229 mm flat round-shaped 
pins, custom AFIX1536FP9 BMP Multimek FP9N, V&P Scientific Inc.), or 
manually manipulated 96-pintool (1.58 mm, 1 μl slot pins, 45 mm, 
VP 408Sa, V&P Scientific Inc.). 

• Plate imaging: At least a 4.0-megapixel camera (Powershot A520, Canon), 
a stationary arm (70 cm mini repro, Industria Fototecnica Firenze, Italy), 
and a plate-shooting platform. 



3.2. Procedure 

The general strategy for performing a screen is to generate an array of 
"prey" strains as indexed colonies grown in a regular grid on agar, mate 
them with individual "bait" strains of the opposite mating type, select for 
diploids, and then transfer these to a methotrexate-containing plate 
for survival selection (Fig. 14.3). The choice of whether to use the MATa or 
MATα strains as bait or prey is arbitrary. Here we describe a procedure in 
which the MATa strains are the baits and the MATα strains are the preys. 
Baits can also be expressed as fusions to DHFR PCA fragments from 
expression plasmids available in our lab and transformed into appropriate 
strains. 

Figure 14.3 (A) The DHFR PCA screen is performed as shown in this schematic. The bait 
reporter strain is incubated in liquid culture. The prey reporter strains are printed on solid 
medium and incubated to be used on multiple assay plates. The mating plate is produced by 
sequentially printing the bait strain and the prey strains on solid agar containing rich medium, 
allowing strains to mate and grow. The resulting mixture of haploid and diploid strains is 
transferred to solid agar plates containing diploid selective medium. The resulting diploid 
strains can be transferred onto plates containing PCA survival-selection medium (containing 
methotrexate). (B) The resulting PCA survival selection plate, here a 6144-density plate 
grown for 2 weeks, can be imaged using a black velvet covered plate fixation platform and a 
basic digital camera. The image can be processed to remove plate sides, allowing image 
analysis to be performed only on the region containing colonies. Images can be corrected 
for nonuniform illumination as described in (http://www.mathworks.com/products/ 
image/demos.html?file=/products/demos/shipping/images/ipexrice.html), and small 
objects, corresponding to bubbles, gel background, and other anomalies, can be removed using 
the imopen function. Finally, the integrated pixel density is computed using pixel intensity, 
represented here as a color- or gray-coded scale, integrated on the area of each colony. 



3.2.1. Experimental procedure 

(1) Inoculate individual bait strains picked from glycerol stocks into 45 ml 
of liquid strain-selective medium (YPD with 100 μg/ml nourseothricin 
for MATa recombinant strains or 250 μg/ml hygromycin B for MATα 
recombinant strains) and allow the culture to reach saturation at 30 °C. 

(2) Print prey strains picked from glycerol stocks onto a 35-ml agar solidified 
omniplate of strain-selective medium (3% agar + YPD with 100 μg/ml 
nourseothricin for MATa recombinant strains or 250 μg/ml hygromycin B 
for MATα recombinant strains) using four manual or robotic 96-pintool 
prints for a total of 384 prints per plate, and incubate for 16 h at 30 °C. 

Note: For the prey strains, step 2 can be repeated from the 384-density 
prints, which can be transferred in a maximum of four 1536-pintool prints 
per omniplate, to achieve a density of up to 6144 colonies. 
Critical steps: 

• Centrifuge a saturated culture of bait strain at 500 × g for 5 min and 
resuspend in 15 ml of YPD. 

• Bait culture must be saturated to print enough cells for efficient 
mating on solid phase. 

• Pintool must be cleaned between each cell transfer. We soak the pins 
twice in a solution of 10% bleach containing glass beads followed by a 
10% bleach wash and two sterile water bath washes. 

(3) Transfer the bait strain suspension into an empty omniplate. 

(4) Print the bait strain suspension from the empty omniplate to a 35-ml 
agar solidified rich medium omniplate (YPD + 3% agar) at the same 
density as the prey strains using a pintool appropriate for the desired 
colony array density. 

(5) Transfer prey strains onto the bait strains on an omniplate containing 
35 ml of solid agar with rich medium (YPD + 3% agar) using the 
appropriate pintool. Allow mating to occur and incubate for 16 h at 
30 °C. 

(6) Transfer the mixed haploid and diploid colonies from Step 5 onto an 
omniplate containing solid agar with diploid selective medium 
(3% agar + YPD with both 100 μg/ml nourseothricin and 250 μg/ml 
hygromycin B) using the appropriate pintool. Incubate for 16 h at 30 °C. 

(7) Transfer the selected diploid strains onto an omniplate of noble agar 
solidified synthetic medium with methotrexate (4% noble agar + 
SC + 2% glucose + 200 μg/ml methotrexate) using an appropriate 
pintool. Incubate at 30 °C and acquire pictures of the colony array every 
96 h for approximately 2 weeks. 
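
The fourfold condensation behind the array densities (96 to 384 to 1536 to 6144) can be sketched as follows. The interleaved layout, in which the four prints are offset by one position in a 2 x 2 pattern, is an assumption made for illustration; the actual offsets depend on the pintool and robot used, and the function names are hypothetical.

```python
def density_after(rounds, start=96):
    """Colony density after a number of fourfold condensation rounds."""
    return start * 4 ** rounds


def condensed_position(row, col, offset, factor=2):
    """Target-grid coordinates for a source colony printed with a given offset.

    offset runs from 0 to 3 and selects one of the four interleaved
    sub-positions (assumed layout, not taken from the protocol).
    """
    d_row, d_col = divmod(offset, factor)
    return row * factor + d_row, col * factor + d_col


# Four 96-format prints (8 rows x 12 columns) fill a 16 x 24 grid exactly once.
targets = {
    condensed_position(r, c, k)
    for k in range(4)
    for r in range(8)
    for c in range(12)
}
densities = [density_after(n) for n in range(4)]
print(len(targets), densities)  # 384 [96, 384, 1536, 6144]
```

Keeping such a mapping alongside the plate images makes it possible to trace every position on a high-density plate back to its source well in the glycerol stock.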

3.2.1.2. Timeline 

Endogenous recombinant strain screen setup (steps 1-2): 8 h to 3 days 
(depending on the screen density achieved). 

Transferring haploids for mating (steps 3-6): 6 min or more per bait strain 
(depending on screen density and robotic routine efficiency) + 16 h 
incubation. 

Diploid cell selection (step 7): 5 min or more per bait (depending on screen 
density and robotic routine efficiency) + 16 h incubation. 

DHFR PCA survival selection (step 8): 5 min or more per bait (depending 
on screen density and robotic routine efficiency) + 2 weeks incubation 
(maximum). 



3.2.1.3. Anticipated results and controls To evaluate a DHFR PCA 
screen, both positive controls (known PPIs) and negative controls 
(fragments alone or non-interacting protein partners) should be tested on 
every plate. Colonies of strains expressing non-interacting protein partners 
exhibit background growth that should stop after a few days of incubation 
on methotrexate-containing plates, whereas colonies containing interacting 
baits and preys will continue to grow. The PCA fragment fusions expressed 
alone should not result in cell proliferation because the individual PCA 
fragments have no activity; thus, if colonies of such strains do grow for 
unknown reasons, they should not be considered for further analysis. 
The most critical controls are those for spontaneous PCA, that is, cases 
where a protein-PCA fragment fusion interacts with the complementary 
fragment alone. We found in our own screen that about 5% of bait or prey 
protein-expressing strains would grow in the presence of methotrexate 
when mated to a strain harboring an expression vector encoding the 
complementary fragment alone (Tarassov et al., 2008). These complementary 
DHFR PCA fragment expression vectors are available upon request from 
our lab. Other controls can be included to test how the PCA screen 
performs. For instance, we have used the engineered heteromeric 
SspB YGMF:SspB LSLA interaction as a positive control to validate DHFR 
PCA activity, as suggested in the troubleshooting section (Table 14.1). 
Another elegant control, which examines the range of dissociation constants 
to which the DHFR PCA is sensitive, is to use a complex for which 
single-point mutations are known from other methods to disrupt the 
interaction to different degrees. To this end, we have used in our own work 
mutants of the Ras binding domain of Raf (Campbell-Valois et al., 2005). 
A potential source of false positives in a PCA screen could be trapping of 
nonspecific complexes due to irreversible folding of the DHFR fragments. 
However, we have used the adenosine 3',5'-monophosphate dependent 
dissociation of the yeast protein kinase A complex as a control (Stefan 
et al., 2007) to show that the DHFR PCA is fully reversible, and thus the 
trapping of complexes is unlikely (Tarassov et al., 2008). Another control 
one could use is a condition-dependent PPI. In our own work, we have used 
the FK506-binding protein, which binds to rapamycin; this complex then 
binds the target of rapamycin (TOR) (Pelletier et al., 1998). All of these 
reagents are available upon request. 
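
The spontaneous PCA control translates into a simple filtering step once the control matings have been scored. The sketch below assumes that the baits and preys that grow when mated to the complementary fragment alone have already been listed; the strain names are placeholders and the function name is hypothetical.

```python
def remove_spontaneous(hits, spontaneous_baits, spontaneous_preys):
    """Keep only bait-prey pairs in which neither partner grows when mated
    to a strain expressing the complementary fragment alone."""
    return [
        (bait, prey)
        for bait, prey in hits
        if bait not in spontaneous_baits and prey not in spontaneous_preys
    ]


# Placeholder strain names for illustration only.
hits = [("BAIT1", "PREY1"), ("BAIT2", "PREY1"), ("BAIT1", "PREY2")]
spontaneous_baits = {"BAIT2"}
spontaneous_preys = {"PREY2"}

filtered = remove_spontaneous(hits, spontaneous_baits, spontaneous_preys)
print(filtered)  # [('BAIT1', 'PREY1')]
```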



Table 14.1 Troubleshooting a large-scale DHFR PCA screen 

| Step | Problem | Possible reason | Solution |
|------|---------|-----------------|----------|
| 1-2 | Strains are not growing or incomplete prey array growth | Erroneous haploid selection | Verify protocol for appropriate culture conditions |
| | | Low glycerol viability | Strains can be streaked on solid agar-selective medium Petri dishes prior to inoculation to increase viability |
| | | Technical problem | Verify that all pins of the pintool touch glycerol stocks and the recipient omniplate |
| 7 | Low number or no colonies on diploid selective plates | Erroneous haploid strain type | Verify mating type of haploid strains |
| | | Technical problem | Pintool alignment might have changed. No modifications to the pintool positioning should be done between transfers |
| 8 | No colony growth on DHFR PCA survival-selective medium | Erroneous selective conditions | Use heteromeric complex SspB YGMF:SspB LSLA as a positive control to validate DHFR PCA activity |
| | | Erroneous DHFR PCA | Verify by a strain diagnostic PCR the complementarity of PCA fragments. Verify DHFR PCA fragment recombinant insertion by genomic sequencing |
| | | DHFR PCA fragment expression | Verify DHFR PCA fragment expression by western blot |
| | All colonies grow at the same rate on DHFR PCA survival-selective medium | Erroneous selective conditions | Use DHFR PCA fragment controls alone as negative control |
| | | Methotrexate solubility | Verify methotrexate solubility under conditions used. Stock solution should not exceed 10 mg/ml in DMSO and final concentration in solid agar plates should not exceed 200 μg/ml |
3.2.2. Analysis of large-scale DHFR PCA screens 

The goal of this section is to turn the size of the colonies on the selection 
plate into binary data that will represent PPIs. First, the digital images have 
to be transformed into tables containing colony intensities. Second, these 
colony intensities have to be turned into PPI confidence scores. 

3.2.2.1. Image analysis Several bioinformatics tools are available to perform 
colony size measurements from digital images of high colony density 
plates (Carpenter et al., 2006; Collins et al., 2006; Memarian et al., 2007). 
Alternatively, tools developed for analysis of spotted DNA microarrays can 
be modified to estimate the sizes of colonies spaced on regular grids 
(Dudley et al., 2005). Globally, the analysis consists of measuring the 
number of pixels per colony position. In cases where high-density plates 
are used (above a 1536-position grid), more involved analysis methods have 
to be used to separate adjacent colonies that may touch each other 
(Tarassov et al., 2008). However, because PPIs are rare, most colonies will 
have a very slow growth rate and this problem is mostly negligible at lower 
densities. Thus, when lower densities are used for the screens, simple macros 
can be implemented in publicly accessible image analysis software such as 
ImageJ (http://rsb.info.nih.gov/ij/). In this case, digital images of plates are 
first converted to 8-bit grayscale format, and colonies are measured by 
positioning the measurement tool on a colony center and estimating the 
integrated pixel intensity in an area that corresponds to the maximal colony 
size allowed. The process is iterated over all the grid positions and then all 
the plates, and the grid positions and intensity values are exported to text 
files for further processing in your favorite spreadsheet or statistical analysis 
software (example ImageJ scripts that we use are available at our web site: 
http://michnick.bcm.umontreal.ca). It is important to note that colonies 
should always have the same positions on the images. If this is not the case, 
some of the tools cited above include a step that positions the analysis grid 
onto the colony positions prior to colony size measurements. 
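
The grid-based measurement described above can be sketched in a few lines. This is a minimal illustration and not the ImageJ macros mentioned in the text: the image is assumed to be an 8-bit grayscale plate picture already loaded as a list of pixel rows, colonies are assumed to sit on a regular grid with a known origin and spacing, and all function and parameter names are hypothetical.

```python
def integrated_intensity(image, center_row, center_col, half_width):
    """Sum the pixel values in a square window centred on a colony position.
    The window is clipped at the image borders."""
    total = 0
    first_row = max(0, center_row - half_width)
    last_row = min(len(image), center_row + half_width + 1)
    for r in range(first_row, last_row):
        row = image[r]
        first_col = max(0, center_col - half_width)
        last_col = min(len(row), center_col + half_width + 1)
        for c in range(first_col, last_col):
            total += row[c]
    return total


def measure_grid(image, n_rows, n_cols, pitch, origin, half_width):
    """Integrated intensity at every grid position, keyed by (row, column)."""
    results = {}
    for i in range(n_rows):
        for j in range(n_cols):
            center_row = origin[0] + i * pitch
            center_col = origin[1] + j * pitch
            results[(i, j)] = integrated_intensity(
                image, center_row, center_col, half_width
            )
    return results


# Toy 8 x 8 image with one 3 x 3 "colony" of intensity 10 at grid position (0, 0).
image = [[0] * 8 for _ in range(8)]
for r in range(1, 4):
    for c in range(1, 4):
        image[r][c] = 10

intensities = measure_grid(image, n_rows=2, n_cols=2, pitch=4,
                           origin=(2, 2), half_width=1)
print(intensities[(0, 0)], intensities[(1, 1)])  # 90 0
```

The window half-width plays the role of the maximal colony size allowed, so it should be chosen smaller than half the grid spacing to avoid counting neighboring colonies.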

3.2.2.2. Statistical analysis of raw colony data: From continuous to binary 
data A PCA screen based on a survival assay will only be useful if there is a 
confidence score attached to each of the putative interactions. Raw colony 
intensity data are continuously distributed, that is, they cover a wide range 
of values and cannot be directly turned into "yes" or "no" binary scores. 
Further, not all the colonies that can grow due to protein-fragment 
complementation will do so at exactly the same rate. As described above, every 
PCA experiment should include a set of positive controls, consisting of pairs 
of baits and preys that interact with each other, and negative controls, 
consisting of pairs of baits and preys that do not interact with each other. 
These will be used for quality control in order to detect mis-positioning of 
the grid and batch effects (variation in media, incubation, drug 
concentration) that affect the global growth rate of the different plates. 
Finally, the positive controls can provide a first, visual analysis of the data, 
whereby the growth rate of the positive controls indicates roughly the 
intensity threshold above which we expect strains with interacting bait-prey 
pairs to grow. Beyond these "qualitative" controls, a statistical analysis 
should be used to separate the interacting pairs from the noninteracting pairs. 

The statistical analysis globally includes two steps. First, it has to be 
determined whether there is a significant difference in growth rates 
among the plates before applying a global analysis to the data. If there is 
significant variation, the data should be normalized such that all the plates 
have the same average colony size. Alternatively, data could be transformed 
into relative scores, such as Z-scores, whereby each data point is 
transformed to become the number of standard deviations that data point is 
from the average of the plate. We found that combining the Z-score and the raw 
intensity worked best for our large-scale screen (Tarassov et al., 2008). 
Second, continuous values must be turned into binary values by setting a 
threshold of intensity above which proteins are inferred to interact, and 
establishing a confidence score for this particular threshold. One way to 
assign confidence values to PCA interactions is to benchmark the intensity 
values against a set of data containing interactions that should be detected 
in the screen (a set of real positives) and others that should not (a set of 
real negatives). The real positive set can be derived from a set of known and 
well-supported interactions. The real negative set, however, has to be 
approximated because it is impossible to show that two proteins never 
interact. Sets of proteins that are most likely not interacting can be used 
for this purpose, for instance proteins that are not localized in the same 
cell compartments and that have negatively correlated expression profiles 
(Collins et al., 2007). One can then predict, for a given intensity 
threshold, what should be the proportion of true positive interactions and 
false positive interactions. In order to decide on the threshold, the ratio of 
true positive interactions to the total number of inferred positives (true 
positives + predicted false positives), known as the positive predictive 
value (PPV), is calculated as a function of the intensity threshold. For 
instance, at a PPV of 95%, one expects 5% of positives to be false. Lower and 
higher thresholds can be used depending on how stringent one wants the 
analysis to be. It is important to note that the estimated PPV is only 
accurate if the relative occurrence of positives and negatives in the 
reference sets is similar to that of the real positives and negatives (Jansen 
and Gerstein, 2004). In the case of a genome-wide, comprehensive screen, this 
fraction corresponds to a very low prior probability of finding interactions 
among all pairwise possibilities. On the other hand, a small-scale screen of a 
specific biological process will contain a greater proportion of real 
positives than a random screen. The reference set therefore needs to be 
tailored for the actual screen being performed, that is, the space of the 
interactome that is covered. For a formal treatment of these issues, refer to 
Jansen and Gerstein (2004). Beyond these statistical considerations, analyses 
such as Gene Ontology enrichment and visualization of interaction clusters 
should be used to further assess the confidence in the data set being 
produced. For instance, the matrix of binary interactions can be clustered to 
identify groups or complexes of interacting proteins. Finally, sets of true 
positives and negatives are not a panacea, and the functional and evolutionary 
characterization of PPIs is the only way to provide a definitive answer as to 
whether an interaction is functionally relevant or not for the cell (Levy 
et al., 2009). 
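
Both steps of the statistical analysis can be illustrated with a small calculation. The sketch below is not the scoring scheme of the published screen: it computes per-plate Z-scores using the population standard deviation (an arbitrary choice) and estimates the PPV at a given threshold from reference sets, with the expected ratio of real negatives to real positives in the screen supplied as an explicit assumption. All names and numbers are illustrative.

```python
from statistics import mean, pstdev


def plate_z_scores(intensities):
    """Z-score of each colony relative to the mean and SD of its own plate."""
    mu = mean(intensities)
    sd = pstdev(intensities)
    if sd == 0:
        return [0.0 for _ in intensities]
    return [(x - mu) / sd for x in intensities]


def fraction_at_or_above(scores, threshold):
    """Fraction of a reference set that would be called positive."""
    return sum(1 for s in scores if s >= threshold) / len(scores)


def ppv(positive_scores, negative_scores, threshold, negatives_per_positive):
    """Positive predictive value, TP / (TP + FP), at a score threshold.

    negatives_per_positive is the assumed number of non-interacting pairs
    per interacting pair in the space actually screened.
    """
    tp = fraction_at_or_above(positive_scores, threshold)
    fp = fraction_at_or_above(negative_scores, threshold) * negatives_per_positive
    if tp + fp == 0:
        return None  # nothing is called positive at this threshold
    return tp / (tp + fp)


z = plate_z_scores([1.0, 2.0, 3.0])

# Made-up reference scores for illustration only.
reference_positives = [5.0, 6.0, 7.0, 8.0]
reference_negatives = [1.0, 2.0, 3.0, 6.0]

ppv_balanced = ppv(reference_positives, reference_negatives, 5.5, 1)
ppv_rare = ppv(reference_positives, reference_negatives, 5.5, 3)
print(ppv_balanced, ppv_rare)  # 0.75 0.5
```

The same threshold gives a lower PPV when real interactions are rarer, which is why the reference sets have to reflect the composition of the screen being analyzed.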




4. A Life and Death Selection PCA Based on 
the Prodrug-Converting Cytosine 
Deaminase for Dissection of PPIs 

In this section, we present a PCA based on an optimized mutant form 
of the reporter enzyme yeast cytosine deaminase (OyCD). The choice 
of yCD as a reporter was based on its role in a pyrimidine salvage pathway 
and the availability of a prodrug, 5-fluorocytosine (5-FC), which is 
converted to 5-fluorouracil (5-FU) by yCD. Bacteria and yeast can convert 
cytosine to uracil and use it for the synthesis of UTP and TTP, which 
are required for cell survival (Kurtz et al., 1999). In S. cerevisiae, yCD is 
encoded by the FCY1 gene and is the enzyme that catalyzes this reaction. In 
addition to deaminating cytosine, yCD can also deaminate 5-FC to 5-FU. 
5-FU is further processed by enzymes of the pyrimidine salvage 
pathway to 5-FUTP, a toxic compound that causes cell death. These 
particular properties of yCD make it an ideal reporter for a life and death 
selection PCA (Fig. 14.4A) (Ear and Michnick, 2009). 

The OyCD PCA allows both death and survival assays to be performed without 
changing the reporter system. In a two-step selection process, we can 
engineer mutant forms of a protein in order to dissect its different 
functions, disrupting interactions with one partner while retaining 
interactions with others (Fig. 14.4B). For example, suppose protein A 
interacts with both protein B and protein C. First, we can screen for mutant 
forms of protein A that disrupt the interaction with protein B. Second, we 
select for protein A mutants that still interact with protein C. Using OyCD 
PCA, neither of these selection steps requires replica plating. In addition, 
no expensive reagent or equipment is required. Specific mutants can be 
obtained in about 4 weeks. 
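
The logic of the two-step selection can be written as a small truth table. The sketch below only encodes the expected growth outcomes described in this section (growth in the survival assay requires reconstituted OyCD activity, growth in the death assay requires its absence); the function names are hypothetical.

```python
def grows(assay, pca_active):
    """Expected growth of an fcy1 deletion strain in each OyCD PCA assay."""
    if assay == "survival":
        # Without uracil, growth requires OyCD activity to convert cytosine.
        return pca_active
    if assay == "death":
        # With 5-FC, OyCD activity produces toxic 5-FU, so active cells die.
        return not pca_active
    raise ValueError(f"unknown assay: {assay}")


def passes_two_step(binds_b, binds_c):
    """A protein A mutant is recovered only if it survives the death screen
    against protein B and then the survival screen against protein C."""
    return grows("death", binds_b) and grows("survival", binds_c)


outcomes = {
    (binds_b, binds_c): passes_two_step(binds_b, binds_c)
    for binds_b in (True, False)
    for binds_c in (True, False)
}
print(outcomes[(False, True)])  # True
```

Only the mutant class that has lost binding to protein B and kept binding to protein C passes both steps.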

Both the survival and death selection assays are performed in fcy1 
deletion strains. For the survival selection assay, uracil must be removed 
from the selection medium. Only cells that have OyCD PCA activity will 
be able to synthesize uracil and survive. For the death selection assay, cells 
are grown in a selection medium in the presence of 5-FC. In this death 
assay, cells that have OyCD PCA activity will be sensitive to 5-FC. 

Figure 14.4 (A) A dual selection PCA. The OyCD PCA can serve as a reporter for 
formation of a protein-protein interaction provided that the reconstituted reporter 
enzyme supports growth under one condition (survival assay) or no growth under 
another condition (death assay). In the case where the two test proteins do not interact, 
the reverse scenarios are observed. (B) Screen for mutants of protein A that do not bind 
to protein B but retain binding to protein C using sequential death followed by survival 
selection OyCD PCA. The first death selection screen consists of screening the library 
of protein A mutants fused to OyCD-F[1] (A*-F[1]) with protein B fused to OyCD-F[2] 
(B-F[2]) and screening for clones that show loss of OyCD PCA activity (growth in the 
presence of 5-FC). The second survival selection step consists of screening A*-F[1] 
clones harvested from the first death selection screen against protein C fused to 
OyCD-F[2] (C-F[2]) for clones that show OyCD PCA activity using the life assay 
(growth in the presence of cytosine). 

4.1. Preparation for a two-step OyCD PCA screen 

The proteins of interest are fused to the N-terminus of OyCD fragment 1 
or fragment 2 (protein A-OyCD-F[1], protein B-OyCD-F[2], and protein 
C-OyCD-F[2]). For some proteins, PPIs can only be detected when they 
are fused to the C-terminus of OyCD fragment 1 (e.g., OyCD-F[1]-protein 
A). OyCD-F[1] corresponds to amino acid residues 1-77 of yCD with an 
A23L point mutation. OyCD-F[2] corresponds to amino acid residues 
57-158 of yCD with the following point mutations: V108I, I140L, T95S, 
and K117E (Ear and Michnick, 2009). The proteins of interest and OyCD 
fragments are separated by a 15-amino acid flexible polypeptide linker 
(Gly.Gly.Gly.Gly.Ser)3. The nucleotide sequences encoding these fusion 
proteins are cloned into yeast expression vectors. We used the p413Gal1 and 
p415Gal1 expression plasmids (Mumberg et al., 1995). Before proceeding 
with a screen, verify that interactions of your proteins of interest are 
detected by OyCD PCA. Titrate the amount of cytosine and 5-FC required 
for detecting your interactions. We normally try a range between 50 and 
1000 μg/ml of cytosine or 5-FC. For many proteins, 100 μg/ml of either 
substrate is sufficient for detecting OyCD PCA activity. Use well-known 
interacting and non-interacting proteins as controls. For the two-step 
OyCD PCA screen, a library of your gene of interest can be generated by 
methods such as error-prone PCR. For example, if the goal is to engineer 
mutant forms of protein A that specifically disrupt the interaction with 
protein B while preserving the interaction with protein C, the ideal library 
consists of protein A variants carrying 1-3 mutations each. 

4.2. Materials 
4.2.1. Reagents 

• BY4741, BY4742, or BY4743 strains with a deletion in the FCY1 gene 
(fcy1Δ) (Giaever et al., 2002) that are resistant to G418. 




• SC medium with the appropriate amino acid drop out according to the 
chosen expression plasmids. 

• Your genes of interest fused to the OyCD fragments in yeast expression 
vectors. 

• Sorbitol buffer (1 M sorbitol, 1 mM EDTA, 10 mM Tris (pH 8.0), 
100 mM lithium acetate) 

• PLATE solution (40% PEG 3350, 100 mM lithium acetate, 10 mM Tris 
(pH 7.5), and 0.4 mM EDTA) 

• Dimethylsulfoxide (Fisher) 

• Sterile distilled water 

• G418 (Wisent) 

• Cytosine (Sigma) 

• 5-Fluorocytosine (Sigma) 

• Agar (Bioshop) 

• Noble agar (Bioshop) 

• DH5α or MC1061 E. coli electrocompetent cells 

• LB medium 

• DNeasy Tissue Kit (Qiagen) 

4.2.2. Facultative 

• Antibodies against yCD fragments: anti-yCD polyclonal (Biogenesis). 



4.2.3. Equipment 

• Genepulser II electroporator system (Bio-Rad) or Electroporator 2510 
(Eppendorf) 

• Electroporation cuvette with 1 mm wide slot (Sigma) 

• Glass spreader 

• 100 mm Petri dishes 

• Shaking incubators, preset to 30 and 37 °C 

• Incubator, preset to 30 and 37 °C 



4.2.4. Experiment preparation 

10 mg/ml stock solution of cytosine: Dissolve 100 mg of cytosine in 10 ml of 
distilled water. Vortex the solution and incubate at 37 °C to make it 
dissolve. Filter the solution and store at room temperature. It is better to 
make this solution fresh and use it within a week. 

10 mg/ml stock solution of 5-FC: Dissolve 100 mg of 5-FC in 10 ml of 
distilled water. Vortex the solution and incubate at 37 °C to make it 
dissolve. Filter the solution and use it right away or aliquot in sterile tubes 
and store at -20 °C. 



352 Stephen W. Michnick et al. 

Control plates: Make SC plates for selection of clones harboring the 
expression plasmids. We used the p413Gal1 and p415Gal1 expression vectors; 
therefore, our control plates contain SC medium without histidine and 
leucine, with 2% agar, 2% raffinose, and 2% galactose. 

Cytosine survival selection plates: Make SC plates without uracil and with 
selection for the expression plasmids. We used the p413Gal1 and p415Gal1 
expression vectors; therefore, our selection plates contain SC medium 
without uracil, histidine, and leucine, with 3% noble agar, 2% raffinose, 
2% galactose, and cytosine (we use 100 μg/ml of cytosine for our proteins 
of interest). 

5-FC death selection plates: Make SC plates with 5-FC and selection for the 
expression plasmids. We used the p413Gal1 and p415Gal1 expression 
vectors; therefore, our selection plates contain SC medium without 
histidine and leucine, with 2% noble agar, 2% raffinose, 2% galactose, 
and 5-FC (we use 100 μg/ml of 5-FC for our proteins of interest). 

4.3. Procedure 

4.3.1. Death selection screen 

(1) Transform (Knop et al., 1999) 1 μg of the library encoding mutant 
forms of protein A (protein A*) into a BY4741 fcy1Δ strain that already 
carries a plasmid expressing protein B (Fig. 14.4B). 

Critical step: Make sure that the efficiency of the transformation gives 
enough colonies to cover six times the size of the library in order to 
have good coverage of potential mutants. For example, if the size of 
the library of protein A* is 5000 clones, make sure to obtain more 
than 30,000 clones. 

(2) Plate half of the transformation on the control plates to select for the 
presence of both expression plasmids (p413Gal1-gene A*-OyCD- 
F[1] + p415Gal1-gene B-OyCD-F[2]). These plates serve as controls 
for reporting the efficiency of the transformation. Plate the other half 
of the transformation on 5-FC death selection plates. 

Critical step: Test the efficiency of your competent yeast cells to 
determine how many cells to plate per 100 mm Petri dish. Do not plate 
more than 5000 cells per 100 mm Petri dish. 

Pause point: Make a glycerol stock of the pooled yeast colonies obtained 
on the control plates as a backup source or for future screens if 
required. 

(3) Incubate plates at 30 °C for 2-3 days. Compare the number of colonies 
obtained on the 5-FC death selection plates to the control plates. 

Critical step: We should expect 10-50% fewer colonies on the 5-FC 
death selection plates in comparison to the control plates. This 
variability depends on the pair of interacting proteins chosen and the 
number of mutations per clone in the library. 
(4) Colonies that grow on 5-FC selection plates are pooled and harvested 
for DNA extraction (Qiagen DNeasy Tissue Kit or a genomic DNA 
purification protocol using phenol-chloroform) in order to recover the 
plasmids that express protein A*. 

Pause point: Yeast cell pellets can be stored at -20 °C for months. 

(5) Digest the extracted DNA with enzyme(s) that cut in the plasmid 
expressing protein B-OyCD-F[2] but not in the plasmids expressing 
the protein A*-OyCD-F[1] library. We use AflII, BspMI, HpaI, MunI, 
NarI, or XcmI since they cut in p415Gal1 and not in the p413Gal1 
plasmid or the gene of interest. This step is not required if the two 
expression plasmids do not have the same antibiotic resistance gene. 

(6) Use 2 μl of extracted DNA for electroporation into electrocompetent 
E. coli cells. We use the MC1061 E. coli strain since it has higher 
transformation efficiency than the DH5α strain. Plate the E. coli on LB 
plates with appropriate antibiotic selection. We use LB with ampicillin 
for the p41XGal1 plasmids. 

(7) Pool E. coli colonies and extract the plasmid DNA using your miniprep 
kit of choice. 

Pause point: E. coli cell pellets can be stored at -20 °C for months. 
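
The coverage rule in the critical steps above (six times the library size, with a cap on cells per dish) can be turned into a quick planning calculation. The sketch below assumes that every library variant is equally represented and sampled independently, which is an idealization; under that assumption the chance of missing a given variant with N transformants from a library of size L is (1 - 1/L)^N, close to e^(-N/L). The function names are hypothetical.

```python
import math


def transformants_needed(library_size, fold_coverage=6):
    """Number of clones required for the stated fold coverage."""
    return library_size * fold_coverage


def plates_needed(n_cells, max_cells_per_plate=5000):
    """Number of 100 mm Petri dishes needed without exceeding the plating cap."""
    return math.ceil(n_cells / max_cells_per_plate)


def fraction_missed(library_size, n_transformants):
    """Expected fraction of variants absent from the transformants, assuming
    equal representation and independent sampling (an idealization)."""
    return (1.0 - 1.0 / library_size) ** n_transformants


library_size = 5000
clones = transformants_needed(library_size)      # 30000
plates = plates_needed(clones)                   # 6
missed = fraction_missed(library_size, clones)   # roughly exp(-6)
print(clones, plates, f"{100 * missed:.2f}% of variants missed")
```

For the 5000-clone example in the text, sixfold coverage leaves roughly a quarter of a percent of variants unsampled under these assumptions; real libraries with uneven representation will do worse.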

4.3.2. Survival selection screen 

(8) Transform, according to Knop et al. (1999), the library encoding 
mutant forms of protein A (protein A*) retrieved after the death 
selection screen into a BY4741 fcy1Δ strain that already carries a plasmid 
expressing protein C (Fig. 14.4B). 

Critical step: Make sure that the efficiency of the transformation gives 
enough colonies to cover six times the size of the library in order 
to have good coverage of potential mutant clones. 

(9) Plate half of the transformation on the control plates to select for the 
presence of both expression plasmids (p413Gal1-gene A*-OyCD- 
F[1] + p415Gal1-gene C-OyCD-F[2]). These plates serve as controls 
for reporting the efficiency of the transformation. Plate the other 
half of the transformation on cytosine survival-selection plates. 

Critical step: Test the efficiency of your competent yeast cells to 
determine how many cells to plate per 100 mm Petri dish. Do not 
plate more than 2000 cells per 100 mm Petri dish. 

Pause point: Make a glycerol stock of the pooled yeast colonies 
obtained on the control plates as a backup source or for future 
screens if required. 

(10) Incubate plates at 30 °C for 3-7 days. 

Critical step: We can expect to obtain from a few to hundreds of 
colonies at this step. This variability depends mostly on the pair of 
interacting proteins that was chosen and the complexity of the library. 
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(11) If the screen resulted in less than 50 colonies, inoculate each yeast colony 
separately in 5 ml of selection medium and harvest cells for DNA 
extraction (Qiagen DNeasy Tissue Kit or a genomic DNA purification 
protocol using phenol-choroform). If over 50 colonies were obtained, 
pooled all the colonies and extract DNA from the pooled cells. 
Pause point Make glycerol stock of the single or pooled yeast colonies 

as a backup source. 

(12) Digest the extracted DNA with enzyme(s) that cut in the plasmid
expressing protein C-OyCD-F[2] but not in the protein A*-OyCD-F[1]
library. We use AflII, BspMI, HpaI, MunI, NarI, or XcmI since they
cut in p415Gal1 and not in the p413Gal1 plasmid or the gene of interest.
This step is not required if the two expression plasmids do not have
the same antibiotic resistance gene.

(13) Use 2 μl of extracted DNA for electroporation into electrocompetent
MC1061 E. coli cells. Plate the E. coli on LB plates with the
appropriate antibiotic selection.

(14) For samples obtained from a single yeast colony in step 11, inoculate 
one or two E. coli colonies for plasmid DNA extraction. For samples 
obtained from pooled yeast colonies in step 11, inoculate over 90 
E. coli colonies for plasmid DNA extraction. 

Pause point Make glycerol stock of the single or pooled bacterial 
colonies as a backup source. 

(15) Digest the isolated plasmids with appropriate restriction enzymes
or perform diagnostic PCR to confirm the presence of gene
A*-OyCD-F[1].

(16) Retransform the purified plasmids expressing protein A mutants
individually into the BY4741 fcy1Δ strain carrying a plasmid expressing
protein B or protein C, and test each for OyCD PCA activity.

(17) Send the purified plasmids expressing protein A mutants for
sequencing in order to identify the mutation(s).
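
The coverage rule given in the critical step before step 9 (enough colonies to cover six times the size of the library, with no more than 2000 cells per 100 mm Petri dish) can be turned into a quick plating estimate. The following is a minimal sketch, not part of the published protocol: the function names and the example library size are hypothetical, and the Poisson estimate assumes that all clones are equally abundant in the library, which the protocol does not state.

```python
import math


def colonies_needed(library_size, fold_coverage=6):
    """Transformants required to reach the requested fold coverage."""
    return library_size * fold_coverage


def plates_needed(n_colonies, max_cells_per_plate=2000):
    """Minimum number of 100 mm dishes at max_cells_per_plate cells each."""
    return math.ceil(n_colonies / max_cells_per_plate)


def fraction_represented(fold_coverage):
    """Expected fraction of clones sampled at least once.

    Poisson approximation; assumes every clone is equally abundant.
    """
    return 1.0 - math.exp(-fold_coverage)


# Hypothetical example: a library of 10,000 distinct clones.
library_size = 10_000
n = colonies_needed(library_size)
print(n, "colonies on at least", plates_needed(n), "plates")
print("expected fraction of clones represented:",
      round(fraction_represented(6), 4))
```

Under these assumptions, sixfold coverage leaves only a fraction of a percent of clones unsampled, which is consistent with the recommendation above.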

4.3.3. Timeline 

5-FC death selection (steps 1-3): 2-3 days.

Cytosine survival selection (steps 8-10): 3-7 days.

Isolation of DNA from yeast (steps 4 and 11): almost 1 day.

Further characterization of the individual clones (steps 14-17): several days,
depending on the number of clones obtained.

Troubleshooting advice can be found in Table 14.2.



4.3.4. Additional information 

We have also generated destination vectors carrying the OyCD fragments
that are compatible with the Gateway cloning system. With these plasmids,
we can take advantage of the existing Gateway expression clones
(distributed by Open Biosystems) to facilitate the process of generating the
fusion between the genes of interest and OyCD fragments.

Table 14.2 Troubleshooting an OyCD PCA screen

Step | Problem | Possible reason | Solution
3 | Less than 10% of colonies died on the 5-FC selection plates | Too many yeast plated on the selection plate | Plate fewer than 1000 cells per 100 mm Petri dish. Increase 5-FC concentration
6 and 13 | No E. coli colonies or very few colonies | Electrocompetent E. coli cells not very competent | Use freshly prepared electrocompetent MC1061 E. coli cells
10 | Several hundred colonies grew on the cytosine selection plates | Too many yeast plated on the selection plate | Plate fewer than 1000 cells per 100 mm Petri dish
10 | Small colonies form around the initial large colony after 4 days of incubation | Uracil can diffuse out of cells that have OyCD PCA activity and allow cells that do not have OyCD PCA activity to grow | Decrease cytosine concentration. Pick only the large colony at the center




5. Visualizing the Localization of PPIs with 
GFP Family Fluorescent Protein PCAs 

The first fluorescent protein PCA was described by Lynne Regan's
group for GFP (Ghosh et al., 2000; Magliery et al., 2005; Wilson et al., 2004)
and we and others have described different color and behavioral variants
(Cabantous et al., 2005; Hu et al., 2002; Macdonald et al., 2006; Nyfeler
et al., 2005; Remy and Michnick, 2004; Remy et al., 2004). PCAs based on
fluorescent proteins have unique features, but also the most caveats to
their application. Notably, and unlike other PCAs, those based on
fluorescent proteins are irreversible. This can be useful (for trapping and
visualizing rare and transient complexes) but also requires care in the
interpretation of turnover or localization of interacting proteins (Hu et al., 2002;
Magliery et al., 2005). It is important that the kinetics of relocalization of
protein interactions observed with fluorescence PCAs be confirmed by
immunofluorescence or by monitoring the localization of the same proteins
fused to full-length fluorescent proteins. Fluorescent protein PCAs are also
limited in the temporal range of dynamics that can be studied. Because
different variants of these proteins take minutes to hours to fold and mature,
they are not appropriate for studying most dynamic processes in a
quantitative way, though many important slower processes can be studied.
PCAs based on luciferase enzyme reporters are, like the DHFR PCA, fully
reversible and can be used to capture kinetics on the second time scale (see
Section 6) (Remy and Michnick, 2006; Stefan et al., 2007). As we previously
demonstrated, PPIs that occur within a specific biochemical pathway
can be modulated in predicted ways by conditions or molecules that activate
or inhibit the pathway. We and others have shown that at least changes in
the formation of complexes can be detected with the GFP and YFP PCAs
(Remy and Michnick, 2004). Further, the subcellular location of stable
complexes and changes in their locations following perturbation can also
be detected in intact living cells with the YFP PCA (Macdonald et al., 2006;
Remy and Michnick, 2004; Remy et al., 2004). It is this ability to detect the
location and intracellular movements of protein complexes that makes
fluorescent protein-based PCAs unique. Because GFP/YFP-based PCAs
do not require additional substrates or cofactors for emission of fluorescence,
they are particularly simple to implement. We have shown that PPIs
can be monitored by fluorescence microscopy, flow cytometry, and
spectroscopy using GFP- and YFP-based PCAs (Macdonald et al., 2006; Remy
and Michnick, 2004; Remy et al., 2004). We have applied these assays to the
detection and quantification of protein interactions, localization of
complexes in living cells, and cDNA library screening in mammalian cells
(Benton et al., 2006; Ding et al., 2006; Macdonald et al., 2006; Nyfeler
et al., 2008; Remy and Michnick, 2004; Remy et al., 2004). In addition, we
have used the YFP-based PCA to detect protein interactions in specific
subcellular compartments of S. cerevisiae, such as the cytoplasm, nucleus, plasma
membrane, and bud neck (Fig. 14.5) (Manderson et al., 2008). In the
following protocol we describe methods for studying PPIs with the "Venus"
mutant of YFP (Nagai et al., 2002).



[Figure 14.5 image panels: Gpa1-YFP; Fus3p-Tec1 (PCA); labels: pheromone, membrane]

Figure 14.5 Venus YFP PCA allows for detection of the precise location of protein
complexes within living cells. Illustration of the visualization of protein complexes
in different regions within cells, using the yeast pheromone response mitogen-activated
protein kinase pathway as an example. Images show the location of interactions of
Fus3p with Gpa1 (Metodiev et al., 2002) at the membrane, with Ste11 (Choi et al.,
1994) in the cytoplasm, and with Tec1 (Chou et al., 2004) in the nucleus. As controls for
the different localizations, Gpa1 fused to full-length Venus YFP is shown to be at
the membrane, while Fus3-Venus YFP is found in both the cytoplasm and the nucleus. Cells
containing Fus3-Venus YFP were treated with 1 μM alpha-factor pheromone for 2-3 h
to induce its translocation to the membrane and nucleus.

5.1. Materials
5.1.1. Reagents

• Competent MATa or diploid yeast prepared according to Knop et al.
(1999)

• SD medium (6.7 g/l yeast nitrogen base, without amino acids)

• SD agar (SD medium with 2% agar)

• 10× amino acid mix -his, -leu, -lys:

   Adenine sulfate      0.4 g/ml
   Uracil               0.2 g/ml
   L-Tryptophan         0.4 g/ml
   L-Arginine HCl       0.2 g/ml
   L-Tyrosine           0.3 g/ml
   L-Phenylalanine      0.5 g/ml
   L-Glutamic acid      1.0 g/ml
   L-Asparagine         1.0 g/ml
   L-Valine             1.5 g/ml
   L-Threonine          2.0 g/ml
   L-Serine             3.75 g/ml
   Methionine           0.2 g/ml (do not include when growing diploid yeast)

• 10× low fluorescence yeast nitrogen base without riboflavin and folic
acid (Sheff and Thorn, 2004)

• 20% glucose solution

• PLATE solution (40% polyethylene glycol 3350, 100 mM LiOAc,
10 mM Tris (pH 7.5), 0.4 mM EDTA)

• DMSO

• Poly-L-lysine, mol. wt. 30,000-70,000 (Sigma P2636) or concanavalin
A (Sigma L7647)
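
The working media used later in the procedure (2% glucose, 1× amino acids) are made by diluting the 20% glucose and 10× amino acid stocks listed above. As a rough aid, the sketch below applies C1·V1 = C2·V2 to those two stocks. The function names are hypothetical, and how the remaining volume is made up (SD medium or water) depends on how the SD base is prepared, which is an assumption left to the user.

```python
def stock_volume(stock_conc, final_conc, final_volume_ml):
    """Volume of stock (ml) needed so that C1 * V1 = C2 * V2.

    stock_conc and final_conc must be expressed in the same unit
    (for example percent, or fold-concentration such as 10x and 1x).
    """
    if final_conc > stock_conc:
        raise ValueError("final concentration cannot exceed the stock")
    return final_conc * final_volume_ml / stock_conc


def sc_medium_volumes(final_volume_ml):
    """Stock volumes (ml) for medium with 2% glucose and 1x amino acids."""
    glucose = stock_volume(20.0, 2.0, final_volume_ml)      # 20% -> 2%
    amino_acids = stock_volume(10.0, 1.0, final_volume_ml)  # 10x -> 1x
    return {
        "20% glucose": glucose,
        "10x amino acid mix": amino_acids,
        # Remainder: SD base and/or water, depending on how SD is prepared.
        "remainder": final_volume_ml - glucose - amino_acids,
    }


print(sc_medium_volumes(100.0))
```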



5.1.2. Equipment 

• Fluorescence microscope (Nikon Eclipse TE2000U inverted microscope
with a CoolSnap HQ Monochrome CCD camera (Photometrics))

• 96-well black, glass-bottom plate (Molecular Machines)

• 6-well culture plate or Petri dish

• Appropriate sterile tubes to grow yeast

• Spectrophotometer SpectraMax Gemini XS (Molecular Devices)

5.1.3. Plasmids

The proteins to test for interaction are fused to the N- and C-terminal
fragments of an enhanced YFP (Venus YFP; Nagai et al., 2002), 5′ or 3′
of the fragments (protein A-vYFP-F[1], vYFP-F[1]-protein A, protein
B-vYFP-F[2], vYFP-F[2]-protein B). vYFP-F[1] (N-terminal) corresponds
to amino acids 1-158, and vYFP-F[2] (C-terminal) corresponds to amino
acids 159-239 of Venus YFP. The fusions are subcloned into the yeast expression
vectors p413ADH for the vYFP-F[1] fusion and p415ADH for the vYFP-F[2]
fusion (Mumberg et al., 1995). We typically insert a 10-amino acid flexible
polypeptide linker consisting of (Gly.Gly.Gly.Gly.Ser)2 between the protein
of interest and the vYFP fragments (Table 14.3). In yeast, fragments can also
be fused to the genes of interest at their chromosomal loci using a homologous
recombination method (Ghaemmaghami et al., 2003). For this purpose
the PCA fragments are cloned into nonexpression vectors that provide a
selection marker (e.g., antibiotic resistance).
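
The construct layout described above (reporter split after residue 158, with a 10-residue Gly/Ser linker between the protein of interest and the fragment) can be sketched at the protein-sequence level as follows. This is only an illustration: the function names are hypothetical, the reporter string is a placeholder rather than the real Venus YFP sequence, and details such as start and stop codons or cloning sites are not modeled.

```python
LINKER = "GGGGS" * 2  # (Gly.Gly.Gly.Gly.Ser)2 in one-letter code, 10 residues


def split_reporter(reporter_seq, f1_last_residue=158):
    """Return (F[1], F[2]); F[1] is residues 1..f1_last_residue (1-based)."""
    if not 0 < f1_last_residue < len(reporter_seq):
        raise ValueError("split point must fall inside the reporter")
    return reporter_seq[:f1_last_residue], reporter_seq[f1_last_residue:]


def fuse(protein_seq, fragment_seq, fragment_position="C"):
    """Join protein and PCA fragment through the flexible linker.

    fragment_position="C" gives protein-linker-fragment;
    fragment_position="N" gives fragment-linker-protein.
    """
    if fragment_position == "C":
        return protein_seq + LINKER + fragment_seq
    if fragment_position == "N":
        return fragment_seq + LINKER + protein_seq
    raise ValueError("fragment_position must be 'N' or 'C'")


# Placeholder 239-residue string standing in for Venus YFP (NOT the real sequence).
placeholder_reporter = "M" + "X" * 238
f1, f2 = split_reporter(placeholder_reporter)
construct = fuse("MPROTEINA", f1)  # hypothetical protein A-linker-vYFP-F[1]
print(len(f1), len(f2), len(construct))
```

Trying both orientations in silico mirrors the troubleshooting advice to fuse the fragment to the other end of the protein when a fusion does not function.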



5.2. Procedure 

5.2.1. Cotransformation of competent yeast 

(1) Thaw competent yeast cells on ice. 

(2) Mix 10 μl of cells with 1 μl (~250 ng) of each yeast expression plasmid
(e.g., p413ADH and p415ADH, Mumberg et al., 1995) encoding the
Venus YFP PCA fusion partners (protein A fused to vYFP-F[1] and
protein B fused to vYFP-F[2]), 60 μl of PLATE solution and 8 μl of
DMSO.

(3) Heat shock yeast at 42 °C for 20 min.

Critical step: Shorter or longer incubation times at higher or lower
temperatures can result in decreased efficiency of transformation.

Table 14.3 Troubleshooting vYFP PCA experiments

Step | Problem | Possible reason | Solution
5.2.1. (6) | No colonies after transformation | Too little DNA or too few cells used | Increase quantity of cells and DNA. Increase the volume of cells plated on the Petri dish or six-well plate
5.2.1. (6) | Too many colonies after transformation | Too many cells plated | Dilute cells before plating on the Petri dish or six-well plate
5.2.2. (3) | Fusion protein is not functioning correctly | Fragment fusion interferes with protein expression/function/stability | Fuse the PCA fragment to the other end of the protein

(4) Centrifuge at 2000 rpm for 3 min. Remove the supernatant and resuspend
the cells in 500 μl of SD medium without amino acids or glucose.

(5) Plate 20 μl of cell suspension per well on SC agar (SD agar + 2%
glucose + 1× amino acids (-his, -leu, -lys for MATa; -his, -leu, -lys,
-met for diploids)) in a 6-well plate.

(6) Incubate at 30 °C for 48-72 h.
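
When many plasmid pairs are transformed in parallel, it helps to total the reagent volumes from step 2 before starting. The sketch below only scales the per-reaction volumes given in the protocol; the names and the optional overage factor are hypothetical, and the protocol does not say whether any components may be premixed.

```python
# Per-reaction volumes (microliters) taken from step 2 of the cotransformation.
PER_REACTION_UL = {
    "competent cells": 10.0,
    "plasmid A (~250 ng)": 1.0,
    "plasmid B (~250 ng)": 1.0,
    "PLATE solution": 60.0,
    "DMSO": 8.0,
}


def reaction_volume_ul():
    """Total volume of one transformation reaction (microliters)."""
    return sum(PER_REACTION_UL.values())


def total_volumes_ul(n_reactions, overage=0.0):
    """Reagent totals for n_reactions; overage=0.1 adds 10% extra (assumption)."""
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in PER_REACTION_UL.items()}


print(reaction_volume_ul())
print(total_volumes_ul(12))
```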

5.2.2. Preparation of cells for fluorescence microscopy 

(1) Inoculate a fresh colony for each sample into 3 ml of SC medium
(SD medium + 2% glucose + 1× amino acids (-his, -leu, -lys for
MATa; -his, -leu, -lys, -met for diploids)) and grow overnight at
30 °C with shaking.

(2) The following day, measure the OD600 of the overnight culture and
inoculate a fresh culture of LFM (1× low fluorescence yeast nitrogen
base + 2% glucose + 1× amino acids (-his, -leu, -lys for MATa; -his,
-leu, -lys, -met for diploids)) with enough cells to obtain an OD600 of
approximately 0.1-0.3 at the time of analysis.

Critical step: It is particularly important for the cells to be in the log phase
of growth in order to avoid including dead and unhealthy cells.
These cells are highly autofluorescent and thus would confound
quantitative analysis. Cells in the lag phase can be used if they are
appropriate for studying a particular interaction, as long as the
condition of the cells is verified by bright field microscopy.

(3) Coat the wells of a glass-bottom 96-well plate (Molecular Machines) with
a solution of 1 mg/ml poly-L-lysine or 50 μg/ml concanavalin A for
10 min, rinse with distilled water and allow to dry. Transfer 70 μl of cell
suspension to each well. Wait 10 min to allow the cells to settle in the
wells. Acquire images with a fluorescence microscope equipped with a
CCD camera, using a YFP filter cube and ~750 ms of exposure time.

Critical step: It is best to use a 60× or 100× objective to discriminate
subcellular structures. Bright field or phase contrast images can be
acquired for each field of view to compare the morphology of the
yeast with the fluorescent PCA signal. Specific functional assays to
further characterize a PPI might be performed here.
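
Step 2 asks for an inoculum that reaches an OD600 of about 0.1-0.3 at the time of analysis. One simple way to estimate the starting density is to back-calculate from the target, assuming purely exponential growth with a fixed doubling time and no lag. Both of these are simplifying assumptions; the doubling time used below is a hypothetical example and should be measured for the strain and medium in use.

```python
def starting_od(target_od, hours_until_analysis, doubling_time_h):
    """OD600 to inoculate at so the culture reaches target_od on time.

    Assumes exponential growth with a constant doubling time and no lag.
    """
    return target_od / 2 ** (hours_until_analysis / doubling_time_h)


def inoculum_volume_ml(overnight_od, start_od, culture_volume_ml):
    """Volume of overnight culture to add to reach start_od (C1*V1 = C2*V2)."""
    return start_od * culture_volume_ml / overnight_od


# Hypothetical example: target OD600 0.2 in 6 h, 2 h doubling time,
# overnight culture at OD600 5.0, 3 ml of fresh LFM.
od0 = starting_od(0.2, 6, 2)
print(od0, inoculum_volume_ml(5.0, od0, 3.0))
```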



5.2.3. Timeline 

Cotransformation of competent yeast (steps 1-5): 30-45 min (depending on
the number of samples) plus 48-72 h for cell growth (step 6).

Preparation of cells for fluorometric analysis (steps 1 and 2): 24 h.

Fluorescence microscopy (step 3): 30 min to hours, depending on the
number of samples.

Microplate reader analysis (step 3): a few minutes or more, depending on
the number of samples.

5.2.4. Anticipated results 

The fluorescence intensity of the reassembled Venus YFP PCA varies with
the expression levels and the interaction dissociation constants of the
protein pairs attached to the PCA fragments. In the case of our simplest
positive control (GCN4 leucine zipper pair fused to the PCA fragments:
Zip-vYFP-F[1] + Zip-vYFP-F[2]), the reconstituted PCA represents
approximately 10-20% of the activity of the full-length Venus YFP. The
PCA fusions expressed alone should not result in detectable fluorescence
(compared to nontransformed cells) because the individual PCA fragments
have no activity. For each study, positive (known interaction) and
particularly negative (noninteracting proteins) controls should always be performed
in parallel. A PCA response should not be observed if noninteracting
proteins are used as PCA partners.
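
To compare a measured PCA signal with the expectations above, the readings can be background-corrected against nontransformed cells and expressed as a percentage of the full-length reporter. The sketch below is one possible way to do this, not the authors' analysis: the function names are hypothetical, and the "mean + 3 SD" detection threshold is an assumed convention that the chapter does not specify.

```python
import statistics


def percent_of_full_length(pca_signal, full_length_signal, background):
    """PCA signal as a percentage of the full-length reporter signal.

    All three values are raw readings; background is the signal of
    nontransformed cells measured under the same conditions.
    """
    corrected_full = full_length_signal - background
    if corrected_full <= 0:
        raise ValueError("full-length control must exceed background")
    return 100.0 * (pca_signal - background) / corrected_full


def is_above_background(signal, background_replicates, n_sd=3.0):
    """True if signal exceeds mean + n_sd * SD of the background replicates.

    The n_sd = 3 cutoff is an assumption, not taken from the protocol.
    """
    mean = statistics.mean(background_replicates)
    sd = statistics.stdev(background_replicates)
    return signal > mean + n_sd * sd


print(percent_of_full_length(250.0, 1100.0, 100.0))
```

The same threshold test can be applied to cells expressing a single PCA fusion, which should not be distinguishable from nontransformed cells.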




6. Studying Dynamics of PPIs with Luciferase 
Reporter PCAs 

It has been a major challenge to measure and quantify the dynamics of
protein complexes in their native state within living cells. Here, we describe
protocols for implementing two luciferase enzyme-based PCAs, Renilla
luciferase (Rluc) and Gaussia luciferase (Gluc), that are designed specifically
to investigate the dynamics of assembly and disassembly of protein
complexes. We have applied these assays to the detection and quantification of
protein interactions in mammalian cells as well as yeast. These assays are
sensitive enough to detect interactions among proteins expressed at
endogenous levels in vivo and to study dynamic changes in both the formation and
disruption of PPIs over seconds without altering the kinetics of binding
(Remy and Michnick, 2006; Stefan et al., 2007). Both of these luciferases
catalyze the oxidation of substrate coelenterate luciferins (coelenterazines)
in a reaction that emits blue light (at a peak of 480 nm) and requires no
cofactors (Tannous et al., 2005). The substrates readily diffuse through cell
membranes and into all cellular compartments, enabling quantitative
analysis in live cells. Rluc and Gluc are monomeric proteins of 312 (36 kDa) and
185 amino acids (19.9 kDa), respectively. The Gluc PCA has some advantage in
that the reporter protein is smaller and has 10 times higher activity toward native
coelenterazine than Rluc. However, at present, Rluc has the advantage
that stable substrates (e.g., benzyl-coelenterazine) can be used with this
reporter, allowing for easier handling and integration of signal over longer
times. In contrast to fluorescent protein-based PCAs, both Rluc and Gluc
are fully reversible, a prerequisite for studying signaling events through the
dynamics of protein complex assembly and disassembly (Remy and Michnick, 2006;
Stefan et al., 2007). Both Rluc and Gluc PCAs provide an extremely high
signal-to-background ratio owing to the lack of any cellular luminescence and can
easily be measured spectroscopically on whole cell populations or by
imaging single cells. Finally, the luciferase PCAs allow for accurate
measurements of the time dependence (for time constants greater than 10 s) and
dose dependence of pharmacologically induced alterations of protein complexes.
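
As an illustration of how a time constant might be extracted from such kinetic measurements, the sketch below fits a single exponential decay, S(t) = S0·exp(-t/τ), to background-corrected luminescence readings by linear regression on the logarithm of the signal. The single-exponential model and the function name are assumptions made for this example; the chapter does not prescribe a fitting method, and real traces may require a different model.

```python
import math


def fit_time_constant(times_s, signals):
    """Least-squares fit of ln(signal) = ln(S0) - t / tau.

    Returns (S0, tau). Signals must be positive (background-corrected).
    A decaying signal gives a positive tau.
    """
    if len(times_s) != len(signals) or len(times_s) < 2:
        raise ValueError("need at least two paired time/signal values")
    if any(s <= 0 for s in signals):
        raise ValueError("signals must be positive after background correction")
    logs = [math.log(s) for s in signals]
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_s, logs))
    slope = sxy / sxx
    if slope == 0:
        raise ValueError("signal does not change; tau is undefined")
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -1.0 / slope


# Synthetic example: a complex dissociating with tau = 25 s.
times = [0, 10, 20, 30, 40]
trace = [1000.0 * math.exp(-t / 25.0) for t in times]
s0, tau = fit_time_constant(times, trace)
print(round(s0, 3), round(tau, 3))
```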



6.1. Materials 
6.1.1. Reagents 

• cDNAs encoding the Rluc and Gluc PCA fusion partners in suitable
expression vectors

• Coelenterazine and benzyl-coelenterazine (Nanolight)

• Competent MATa or diploid yeast prepared according to Knop et al.
(1999)

• SD medium (6.7 g/l yeast nitrogen base, without amino acids)

• SD agar (SD medium with 2% agar)

• 10× amino acid mix -his, -leu, -lys:

   Adenine sulfate      0.4 g/ml
   Uracil               0.2 g/ml
   L-Tryptophan         0.4 g/ml
   L-Arginine HCl       0.2 g/ml
   L-Tyrosine           0.3 g/ml
   L-Phenylalanine      0.5 g/ml
   L-Glutamic acid      1.0 g/ml
   L-Asparagine         1.0 g/ml
   L-Valine             1.5 g/ml
   L-Threonine          2.0 g/ml
   L-Serine             3.75 g/ml
   Methionine           0.2 g/ml (do not include when growing diploid yeast)

• 10× low fluorescence yeast nitrogen base without riboflavin and folic
acid (Sheff and Thorn, 2004)

• 20% glucose solution

• PLATE solution (40% polyethylene glycol 3350, 100 mM LiOAc,
10 mM Tris (pH 7.5), 0.4 mM EDTA)

• DMSO

• Poly-L-lysine, mol. wt. 30,000-70,000 (Sigma P2636) or concanavalin A
(Sigma L7647)



6.1.2. Equipment

• Luminescence microplate reader (LMax II Luminometer, Molecular
Devices)

• Luminescence microscope (Nikon Eclipse TE2000U inverted microscope
with a CoolSnap HQ Monochrome CCD camera (Photometrics))

• 96-well white plates (Molecular Machines)

• 6-well culture plate or Petri dish

• Appropriate sterile tubes to grow yeast

• Spectrophotometer

6.1.3. Plasmids 

The proteins to test for interaction are fused to the coding sequences for
N- and C-terminal fragments of Rluc or Gluc, 5′ or 3′ of the fragments
(e.g., protein A-Rluc-F[1], Rluc-F[1]-protein A, protein B-Rluc-F[2],
Rluc-F[2]-protein B). Rluc-F[1] (N-terminal) corresponds to amino acids
1-110, and Rluc-F[2] (C-terminal) corresponds to amino acids 111-312 of
Rluc (Stefan et al., 2007). Similarly, Gluc-F[1] corresponds to amino acids
1-63 and Gluc-F[2] to amino acids 64-185 of Gluc (Remy and Michnick,
2006). The fusions are subcloned into yeast expression vectors, for example,
p413ADH for the Rluc-F[1] or Gluc-F[1] fusion and p415ADH for the
Rluc-F[2] or Gluc-F[2] fusion (Mumberg et al., 1995). In yeast, fragments
can also be fused to the genes of interest at their chromosomal loci using a
homologous recombination method (Ghaemmaghami et al., 2003). For this
purpose the PCA fragments are cloned into nonexpression vectors that
provide a selection marker (e.g., antibiotic resistance). For example,
pAG25-Rluc-F[1] and pAG32-Rluc-F[2] plasmids are used for the Rluc
PCA fragment fusions.
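
When designing primers or checking constructs, it can be useful to confirm that the two fragments of each reporter are contiguous and together span the full-length protein. The sketch below encodes the residue ranges quoted in this chapter for Venus YFP, Rluc, and Gluc and checks them; the dictionary layout and function names are hypothetical.

```python
# Residue ranges (1-based, inclusive) as quoted in the text of this chapter.
REPORTERS = {
    "Venus YFP": {"length": 239, "F[1]": (1, 158), "F[2]": (159, 239)},
    "Rluc": {"length": 312, "F[1]": (1, 110), "F[2]": (111, 312)},
    "Gluc": {"length": 185, "F[1]": (1, 63), "F[2]": (64, 185)},
}


def fragments_tile(reporter):
    """True if F[1] starts at 1, F[2] follows directly, and F[2] ends at length."""
    (start1, end1), (start2, end2) = reporter["F[1]"], reporter["F[2]"]
    return start1 == 1 and start2 == end1 + 1 and end2 == reporter["length"]


def fragment_lengths(reporter):
    """Number of residues in each fragment."""
    return {
        name: reporter[name][1] - reporter[name][0] + 1
        for name in ("F[1]", "F[2]")
    }


for name, reporter in REPORTERS.items():
    print(name, fragments_tile(reporter), fragment_lengths(reporter))
```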

6.2. Procedure 

6.2.1. Cotransformation of competent yeast 

(1) Thaw competent yeast cells on ice.

(2) Mix 10 μl of cells with 1 μl (~250 ng) of each yeast expression plasmid
(e.g., p413ADH and p415ADH, Mumberg et al., 1995) encoding the
Rluc or Gluc PCA fusion partners (protein A fused to Rluc-F[1] or
Gluc-F[1] and protein B fused to Rluc-F[2] or Gluc-F[2]), 60 μl of
PLATE solution and 8 μl of DMSO.

(3) Heat shock yeast at 42 °C for 20 min.

Critical step: Shorter or longer incubation times at higher or lower
temperatures can result in decreased efficiency of transformation.

(4) Centrifuge at 2000 rpm for 3 min. Remove the supernatant and resuspend
the cells in 500 μl of SD medium without amino acids or glucose.

(5) Plate 20 μl of cell suspension per well on SC agar (SD agar + 2%
glucose + 1× amino acids (-his, -leu, -lys for MATa; -his, -leu, -lys,
-met for diploids)) in a 6-well plate.

(6) Incubate at 30 °C for 48-72 h.
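
The centrifugation in step 4 is specified in rpm, so the force actually applied depends on the rotor. For transferring the step to a different centrifuge, the standard conversion RCF = 1.118 × 10^-5 × r × rpm^2 (r in cm) can be used. The rotor radius in the example below is a hypothetical value, since the protocol does not name a rotor.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf_from_rpm(rpm, rotor_radius_cm):
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * rotor_radius_cm * rpm ** 2


def rpm_from_rcf(rcf, rotor_radius_cm):
    """Speed (rpm) needed to reach a given RCF on a rotor of this radius."""
    return math.sqrt(rcf / (RCF_CONSTANT * rotor_radius_cm))


# Hypothetical rotor radius of 8 cm; substitute the value for your rotor.
print(rcf_from_rpm(2000, 8.0))
```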

6.2.2. Fusion of PCA fragments at the chromosomal loci 

(1) PCR amplify the Rluc or Gluc PCA fragment cassettes (containing the
PCA fragment followed by a terminator that is followed by an antibiotic
selection marker; available upon request).

(2) Transform the PCR product into suitable competent cells: mix 10 μl of
thawed competent cells with 10 μl (~1-2 μg) of each PCR-amplified
cassette DNA encoding the Rluc or Gluc PCA fragments along with a
resistance marker; add 85 μl of PLATE solution; incubate for 30 min at
room temperature; add 9.5 μl of DMSO followed by heat shock at 42 °C for
20 min; centrifuge at 2000 rpm for 3 min, remove the supernatant and
resuspend the cells in 500 μl of YPD medium and incubate at 30 °C with shaking
for 4 h; centrifuge the cells, remove the supernatant and resuspend the cells in
200 μl of YPD; plate 60 μl per well in a 6-well plate or the entire 200 μl on a Petri
dish that contains the suitable antibiotic; incubate the plates at 30 °C for
48-72 h; the colonies can be further verified by colony PCR methods.

6.2.3. Preparation of cells for bioluminescence assay 

(1) Inoculate a fresh colony for each sample with plasmids into 3 ml of SC
medium (SD medium + 2% glucose + 1× amino acids (-his, -leu, -lys
for MATa; -his, -leu, -lys, -met for diploids)) and grow overnight at
30 °C with shaking. For cells with fragments fused at chromosomal loci,
grow them in SC medium with a suitable antibiotic.

(2) The following day, measure the OD600 of the overnight culture and
inoculate a fresh culture of LFM (1× low fluorescence yeast nitrogen
base + 2% glucose + 1× amino acids (-his, -leu, -lys for MATa; -his,
-leu, -lys, -met for diploids)) or LFM complete with suitable antibiotics
with enough cells to obtain an OD600 of approximately 0.1-0.3 at the
time of analysis.

Critical step: It is particularly important for the cells to be in the log phase
of growth in order to avoid including dead and unhealthy cells.

(3) Transfer 160-180 μl of cell suspension (cells equivalent to 0.1-0.3
OD600) to each well. Manually add or inject 20-40 μl of suitable
substrate using the luminometer injector and initiate the
bioluminescence analysis. Optimize the signal integration times depending
on the bioluminescence signal strength. For real-time kinetics experiments,
add or inject the substrate and immediately initiate the
bioluminescence readings with the optimized signal integration time,
continuously for the desired period. Then, background-correct the
bioluminescence signals to obtain a meaningful signal. Afterwards,
normalize the data to total protein concentration in cell lysates if desired
(Bio-Rad protein assay).

Critical step: Specific functional assays to further characterize a PPI might
be performed here. For example, incubation of cells with agents, such
as specific enzyme or transport inhibitors, can be performed
for various amounts of time prior to the luminometric analysis.

Troubleshooting advice can be found in Table 14.4.

Table 14.4 Troubleshooting Rluc or Gluc luciferase PCAs

Step | Problem | Possible reason | Solution
6.2.1. (6) | No colonies after transformation | Not enough DNA or cells | Increase quantity of cells and DNA. Increase the number of cells plated on the Petri dish or six-well plate
6.2.1. (6) | Too many colonies after transformation | Too many cells plated | Dilute cells before plating on the Petri dish or six-well plate
6.2.2. (2) | Fusion protein is not functioning correctly | Fragment fusion interferes with protein expression/function/stability | Fuse the PCA fragment to the other end of the protein
6.2.3. (3) | Poor luminescence signal in the luminescence assay | Signal integration time is too short | Optimize the signal integration times
6.2.3. (3) | Poor luminescence signal in the luminescence assay | Not enough substrate | Increase the substrate concentration
6.2.3. (3) | Poor luminescence signal in the luminescence assay | Not enough cells used | Increase the number of cells used per assay
6.2.3. (3) | No or low signal modulation after stimulus or inhibitor treatment | Number of cells and signal integration times are not optimal | Optimize the number of cells and signal integration times
6.2.3. (3) | No or low signal modulation after stimulus or inhibitor treatment | Stimulus or inhibitor concentration is too low or duration of treatment is not optimal | Try different stimulus or inhibitor treatment times and/or concentrations
6.2.3. (3) | No or low signal modulation after stimulus or inhibitor treatment | Signal detection time is not optimal | Peak signal occurs immediately after addition of coelenterazines. Try optimizing the beginning of signal integration after substrate addition
6.2.3. (3) | No or low signal modulation after stimulus or inhibitor treatment | Signal-to-background ratio is low | If the signal is very low, find an optimal way to extract the meaningful signal from background. Test appropriate positive and negative controls for the interaction you are studying
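
The data handling described in step 3 (background correction, optional normalization to total protein, and integration of the signal over time) can be sketched as follows. This is an illustrative outline rather than the authors' analysis pipeline: the function names are hypothetical, the background is assumed to be a single reading from control wells, and trapezoidal integration is one choice among several.

```python
def background_correct(readings, background):
    """Subtract the background reading (e.g., control wells) from each value."""
    return [r - background for r in readings]


def normalize_to_protein(readings, protein_conc):
    """Divide each reading by the total protein concentration of the lysate."""
    if protein_conc <= 0:
        raise ValueError("protein concentration must be positive")
    return [r / protein_conc for r in readings]


def integrate_signal(times_s, readings):
    """Trapezoidal integral of the luminescence trace over time."""
    if len(times_s) != len(readings):
        raise ValueError("times and readings must have the same length")
    total = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        total += dt * (readings[i] + readings[i - 1]) / 2.0
    return total


# Hypothetical kinetic trace: raw counts read every 10 s.
times = [0, 10, 20, 30]
raw = [520.0, 420.0, 320.0, 270.0]
corrected = background_correct(raw, 20.0)
print(normalize_to_protein(corrected, 2.0))
print(integrate_signal(times, corrected))
```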

6.2.4. Timeline 

Cotransformation of competent yeast (steps 1-5): 30-45 min (depending on
the number of samples) plus 48-72 h for cell growth (step 6).

Fusion of PCA fragments at the chromosomal loci: 5 h (depending on the
number of samples) plus 48-72 h for cell growth.

Preparation of cells for luminometric analysis (steps 1-3): 24 h.

Microplate reader analysis: a few minutes to hours, depending on the signal
integration time and the number of samples.

6.2.5. Anticipated results 

The luminescence intensity of the reassembled Rluc and Gluc PCAs varies
with the strength of interaction between the protein pairs attached to the
PCA fragments. In the case of our simplest positive control (GCN4 leucine
zipper pair fused to the PCA fragments: e.g., Zip-Rluc-F[1] + Zip-Rluc-F[2]),
the reconstituted PCAs represent approximately 10-30% of the activity
of the full-length Rluc or Gluc enzymes. The PCA fusions expressed alone
should not result in detectable luminescence (compared to nontransformed
cells) because the individual PCA fragments have no activity. For each study,
positive (known interaction) and particularly negative (noninteracting
proteins) controls should always be performed in parallel. A PCA response should
not be observed if noninteracting proteins are used as PCA partners.



REFERENCES 

Benton, R., et al. (2006). Atypical membrane topology and heteromeric function of 
Drosophila odorant receptors in vivo. PLoS Biol. 4, e20. 

Cabantous, S., Terwilliger, T. C., et al. (2005). Protein tagging and detection with
engineered self-assembling fragments of green fluorescent protein. Nat. Biotechnol. 23(1),
102-107.






Campbell-Valois, F. X., Tarassov, K., et al. (2005). Massive sequence perturbation of a small
protein. Proc. Natl. Acad. Sci. USA 102(42), 14988-14993.

Carpenter, A. E., Jones, T. R., et al. (2006). CellProfiler: Image analysis software for
identifying and quantifying cell phenotypes. Genome Biol. 7(10), R100.

Choi, K. Y., Satterberg, B., et al. (1994). Ste5 tethers multiple protein kinases in the MAP
kinase cascade required for mating in S. cerevisiae. Cell 78(3), 499-512.

Chou, S., Huang, L., et al. (2004). Fus3-regulated Tec1 degradation through SCFCdc4
determines MAPK signaling specificity during mating in yeast. Cell 119(7), 981-990.

Collins, S. R., Schuldiner, M., et al. (2006). A strategy for extracting and analyzing
large-scale quantitative epistatic interaction data. Genome Biol. 7(7), R63.

Collins, S. R., Kemmeren, P., et al. (2007). Toward a comprehensive atlas of the physical
interactome of Saccharomyces cerevisiae. Mol. Cell Proteomics 6(3), 439-450.

Ding, Z., et al. (2006). A retrovirus-based protein complementation assay screen reveals
functional AKT1-binding partners. Proc. Natl. Acad. Sci. USA 103, 15014-15019.

Dudley, A. M., Janse, D. M., et al. (2005). A global view of pleiotropy and phenotypically
derived gene function in yeast. Mol. Syst. Biol. 1(2005), 0001.

Ear, P. H., and Michnick, S. W. (2009). A general life-death selection strategy for dissecting
protein functions. Nat. Methods 6(11), 813-816.

Ercikan-Abali, E. A., Waltham, M. C., et al. (1996). Variants of human dihydrofolate
reductase with substitutions at leucine-22: Effect on catalytic and inhibitor binding
properties. Mol. Pharmacol. 49(3), 430-437.

Ghaemmaghami, S., Huh, W. K., et al. (2003). Global analysis of protein expression in yeast.
Nature 425(6959), 737-741.

Ghosh, I., Hamilton, A. D., et al. (2000). Antiparallel leucine zipper-directed protein
reassembly: Application to the green fluorescent protein. J. Am. Chem. Soc. 122(23),
5658-5659.

Giaever, G., Chu, A. M., et al. (2002). Functional profiling of the Saccharomyces cerevisiae
genome. Nature 418(6896), 387-391.

Hu, C. D., Chinenov, Y., et al. (2002). Visualization of interactions among bZIP and Rel
family proteins in living cells using bimolecular fluorescence complementation. Mol. Cell
9(4), 789-798.

Jansen, R., and Gerstein, M. (2004). Analyzing protein function on a genomic scale: The
importance of gold-standard positives and negatives for network prediction. Curr. Opin.
Microbiol. 7(5), 535-545.

Knop, M., Siegers, K., et al. (1999). Epitope tagging of yeast genes using a PCR-based
strategy: More tags and improved practical routines. Yeast 15(10B), 963-972.

Kurtz, J. E., Exinger, F., et al. (1999). New insights into the pyrimidine salvage pathway of
Saccharomyces cerevisiae: Requirement of six genes for cytidine metabolism. Curr. Genet.
36(3), 130-136.

Levy, E. D., Landry, C. R., et al. (2009). How perfect can protein interactomes be?
Sci. Signal. 2(60), pe11.

Macdonald, M. L., Lamerdin, J., et al. (2006). Identifying off-target effects and hidden
phenotypes of drugs in human cells. Nat. Chem. Biol. 2(6), 329-337.

Magliery, T. J., Wilson, C. G., et al. (2005). Detecting protein-protein interactions with a
green fluorescent protein fragment reassembly trap: Scope and mechanism. J. Am. Chem. Soc.
127(1), 146-157.

Manderson, E. N., Malleshaiah, M., et al. (2008). A novel genetic screen implicates Elm1 in
the inactivation of the yeast transcription factor SBF. PLoS ONE 3(1), e1500.

Memarian, N., Jessulat, M., et al. (2007). Colony size measurement of the yeast gene deletion
strains for functional genomics. BMC Bioinformatics 8, 117.

Metodiev, M. V., Matheos, D., et al. (2002). Regulation of MAPK function by direct
interaction with the mating-specific Galpha in yeast. Science 296(5572), 1483-1486.






Michnick, S. W., Remy, I., et al. (2000). Detection of protein-protein interactions by
protein fragment complementation strategies. Methods Enzymol. 328, 208-230.

Michnick, S. W., Ear, P. H., et al. (2007). Universal strategies in research and drug discovery
based on protein-fragment complementation assays. Nat. Rev. Drug Discov. 6(7),
569-582.

Mumberg, D., Muller, R., et al. (1995). Yeast vectors for the controlled expression of
heterologous proteins in different genetic backgrounds. Gene 156(1), 119-122.

Nagai, T., Ibata, K., et al. (2002). A variant of yellow fluorescent protein with fast and
efficient maturation for cell-biological applications. Nat. Biotechnol. 20(1), 87-90.

Nyfeler, B., Michnick, S. W., et al. (2005). Capturing protein interactions in the secretory
pathway of living cells. Proc. Natl. Acad. Sci. USA 102(18), 6350-6355.

Nyfeler, B., et al. (2008). Identification of ERGIC-53 as an intracellular transport receptor of
α1-antitrypsin. J. Cell Biol. 180, 705-712.

Pelletier, J. N., and Michnick, S. W. (1997). A protein complementation assay for
detection of protein-protein interactions in vivo. Protein Eng. 10(Suppl.), 89.

Pelletier, J. N., Campbell-Valois, F. X., et al. (1998). Oligomerization domain-directed
reassembly of active dihydrofolate reductase from rationally designed fragments. Proc.
Natl. Acad. Sci. USA 95(21), 12141-12146.

Pelletier, J. N., Arndt, K. M., et al. (1999). An in vivo library-versus-library selection of
optimized protein-protein interactions. Nat. Biotechnol. 17(7), 683-690.

Remy, I., and Michnick, S. W. (1999). Clonal selection and in vivo quantitation of protein
interactions with protein-fragment complementation assays. Proc. Natl. Acad. Sci. USA
96(10), 5394-5399.

Remy, I., and Michnick, S. W. (2001). Visualization of biochemical networks in living cells.
Proc. Natl. Acad. Sci. USA 98(14), 7678-7683.

Remy, I., and Michnick, S. W. (2004). A cDNA library functional screening strategy based
on fluorescent protein complementation assays to identify novel components of signaling
pathways. Methods 32(14), 381-388.

Remy, I., and Michnick, S. W. (2006). A highly sensitive protein-protein interaction assay
based on Gaussia luciferase. Nat. Methods 3(12), 977-979.

Remy, I., Wilson, I. A., et al. (1999). Erythropoietin receptor activation by a ligand-induced
conformation change. Science 283(5404), 990-993.

Remy, I., Montmarquette, A., et al. (2004). PKB/Akt modulates TGF-beta signalling
through a direct interaction with Smad3. Nat. Cell Biol. 6(4), 358-365.

Sheff, M. A., and Thorn, K. S. (2004). Optimized cassettes for fluorescent protein tagging in
Saccharomyces cerevisiae. Yeast 21(8), 661-670.

Stefan, E., Aquin, S., et al. (2007). Quantification of dynamic protein complexes using
Renilla luciferase fragment complementation applied to protein kinase A activities
in vivo. Proc. Natl. Acad. Sci. USA 104(43), 16916-16921.

Subramaniam, R., Desveaux, D., et al. (2001). Direct visualization of protein interactions in
plant cells. Nat. Biotechnol. 19(8), 769-772.
Tannous, B. A., Kim, D. E., et al. (2005). Codon-optimized Gaussia luciferase cDNA for 

mammalian gene expression in culture and in vivo. Mol. Ther. 11(3), 435-443. 
Tarassov, K., Messier, V., et al. (2008). An in vivo map of the yeast protein interactome. 

Science 320(5882), 1465-1470. 
Wilson, C. G., Magliery, T. J., et al. (2004). Detecting protein-protein interactions with 

GFP-fragment reassembly. Nat. Methods 1(3), 255-262. 




CHAPTER FIFTEEN 



Yeast Lipid Analysis and 
Quantification by Mass Spectrometry 

Xue Li Guan,* Isabelle Riezman,* Markus R. Wenk,
and Howard Riezman*

Contents

1. Introduction
2. Methods
2.1. Sample preparation
2.2. Normalization for starting amount of material
2.3. Lipid extraction (for glycerophospholipids and sphingolipids)
2.4. Sterol isolation
2.5. Lipid analysis
Acknowledgments
References

Abstract 

The systematic and quantitative analysis of the different lipid species within a
cell or an organism has recently become possible, and the general approach has
been termed "lipidomics." Traditional methods for identifying and quantifying
lipid species were laborious and required a wide variety of techniques,
especially for assigning particular acyl chain lengths, hydroxylations, and
desaturations to the diverse lipid species. While it is still not possible to
quantitatively analyze all lipid species in a single analysis, great progress has
been made through the intensive use of quantitative mass spectrometry
approaches. It is now relatively simple to quantify most of the lipid species,
including all of the major ones, in a yeast cell. Different degrees of
sophistication of mass spectrometric analysis exist, and the available techniques
and instrumentation are evolving rapidly. Therefore, we have decided to present
robust, simple methods to quantify the major yeast lipids by mass spectrometry
that should be accessible to anyone who has access to standard mass
spectrometry equipment. The methods to identify and quantify yeast
glycerophospholipids and sphingolipids involve electrospray ionization mass
spectrometry using fragmentation to characterize the lipid species. A simplified
gas chromatographic method is used to quantify the major sterols that occur
in wild-type yeast cells and ergosterol biosynthesis mutants.




1. Introduction 



Studies in yeast, notably Saccharomyces cerevisiae, have played a particularly
important role in the advancement of our knowledge in biology. An
indispensable aspect of yeast research, in addition to genetics and molecular
biology, is lipid biochemistry and analysis. Lipids make up the bulk of cellular
membranes and have attracted immense interest because of their emerging
cellular functions beyond their structural roles.

The three major classes of membrane lipids in all eukaryotic organisms are
glycerophospholipids, sphingolipids, and sterols. Each class is structurally
diverse, with the diversity arising from both nonpolar chains and head group
substitutions (Fig. 15.1). The glycerophospholipids are diversified by their
head group and fatty acyl compositions, and the major head groups are
conserved among all eukaryotes. Lipid composition varies between organelles
(Schneiter et al., 1999). On the other hand, the sphingolipids of S. cerevisiae
are distinct and relatively simple compared to those of mammals. There are
three classes of inositol-containing sphingolipids: inositol-phosphorylceramide
(IPC), mannose-inositol-phosphorylceramide (MIPC), and
mannose-(inositol-P)2-ceramide (M(IP)2C), localized mainly to the plasma
membrane, where they constitute 20-30% of all the lipids (Patton and Lester,
1991). Their biosynthesis begins in the endoplasmic reticulum (ER) and is
completed in the Golgi apparatus (Dickson et al., 2006). Another distinct
characteristic of the lipid composition of a normal yeast cell is that, instead of
cholesterol, it contains ergosterol (ergosta-5,7,22-trienol), which makes up
approximately 75% of its sterols, and a variety of other sterols, most of which
are intermediates in the sterol biosynthesis pathway (Munn et al., 1999). Most
of the ergosterol is found in the plasma membrane, and many of the other
sterol molecules are enriched in the ER, where sterol synthesis occurs (Zinser
et al., 1993). Sterols can be esterified and stored in lipid bodies, from where
they can be rapidly mobilized (Wagner et al., 2009). Although the major lipids
and the biosynthetic machineries in S. cerevisiae have been extensively
characterized (Daum et al., 1998; Ejsing et al., 2009; Guan and Wenk, 2006),
novel chemical entities may have escaped discovery in previous analyses, which
warrants the development of high-resolution and sensitive methodologies for
detection and characterization. Furthermore, the molecular signature of
individual lipids is increasingly recognized to encode highly specific, though
not necessarily unique, functions. This has been further supported by recent
revelations that the intricate interactions between specific lipids and proteins
are important for cellular functions (Guan et al., 2009; Valachovic et al., 2004).
High-resolution analysis of lipids is therefore critical for enhancing our
understanding of the biological roles of lipids.

Figure 15.1 Structures of major membrane lipids in S. cerevisiae:
(A) glycerophospholipids, (B) sphingolipids, (C) ergosterol. R refers to possible
head group modifications.

Traditional methods of lipid analysis include metabolic labeling, thin-layer
chromatography (TLC), and gas chromatography (GC) (Wenk, 2005).
Metabolic labeling using lipid precursors (such as radiolabeled inositol,
ethanolamine, or fatty acyls) has been widely used to (selectively) label
certain classes of lipids, which are typically separated by TLC and
visualized by autoradiography. However, these techniques only deliver
mass levels of lipids under conditions of steady-state incorporation of the
label. Moreover, TLC separation is generally of low resolution and does
not provide molecular information on the diversity of fatty acyl compositions,
which requires subsequent analysis, often by GC. Nonetheless,
although these methods suffer from limited sensitivity, selectivity, and
resolution, they are still commonly used today because of their relative ease
and the low investment cost in instrumentation.

In the last 15-20 years, the field of lipid research has been spurred by the
development of mass spectrometry (MS), and in particular soft ionization
techniques such as electrospray ionization (ESI) and matrix-assisted laser
desorption ionization (MALDI). These methods have allowed rapid and
sensitive profiling, characterization, and quantification of lipids in complex
lipid mixtures (Han et al., 2008; Shui et al., 2007). However, despite
continuous advancement in the technology, such as enhanced sensitivity
through microfluidics-based ionization (Han et al., 2008) and high resolution
through Fourier transform ion cyclotron resonance (FT-ICR) or Orbitrap MS
(Schwudke et al., 2007), there is no single method to probe the entire
lipidome of a cell. S. cerevisiae has a relatively simple lipidome compared to
other eukaryotic cells, yet it is only recently that a fairly comprehensive
analysis of its lipidome has been described, and such a detailed catalogue is
still unavailable for the more complex lipidomes of cell types in other
organisms. Ejsing and coworkers described the absolute quantification of
250 lipid molecular species in S. cerevisiae and estimated that the lipids
analyzed cover 95% of the yeast lipidome (Ejsing et al., 2009). The failure to
completely annotate and measure an entire cellular lipidome is attributed to
the complex chemistry of the different classes of lipids as well as the dynamic
range of their concentrations, which limit their extractability and ionization
using a single method. In other words, depending on the lipid(s) of interest,
multiple extraction protocols and different analytical tools are still required
to examine the lipidome of any given biological sample.

In this chapter, we describe the extraction and mass spectrometric analysis
of various classes of lipids, including glycerophospholipids, sphingolipids, and
sterols from yeast, specifically S. cerevisiae. Important considerations for
isolation of lipids from the cellular milieu include recovery and potential
activation of lipases during the extraction procedure. While protocols based
on the Folch and the Bligh and Dyer methods (Bligh and Dyer, 1959; Folch
et al., 1957) are efficient for isolating lipids from mammalian cells and
tissues, in yeast these methods are effective for isolating glycerophospholipids
and sphingolipids from broken cells but not from intact cells (Hanson and
Lester, 1980). Furthermore, recovery of the entire cellular lipidome is a
technical challenge because of the diverse chemistry of lipids in cells. The
polarity of lipids differs depending on the lipid backbone as well as the head
group modification, and therefore the selection of organic solvents is critical.
For instance, phosphorylation of sphingolipids renders them highly polar, and
thus they may escape into the aqueous phase during phase separation using
the Folch or Bligh and Dyer methods. Here, we describe a modified protocol
for isolation of lipids from intact yeast cells based on Angus and Lester's
method (Angus and Lester, 1972). This involves the use of a slightly alkaline
mixture of ethanol-diethyl ether-water-pyridine-ammonium hydroxide
at elevated temperature, which leads to effective extraction of
inositol-containing lipids and glycerophosphatidylcholine from intact
S. cerevisiae and Neurospora crassa (Hanson and Lester, 1980). However, for
analysis of subcellular organelles, the modified Bligh and Dyer method may be
applicable. An additional isolation method, involving alkaline hydrolysis and
extraction with petroleum ether, is included for the analysis of total sterols.

Various MS-based methods are described for the analysis of the
different classes of lipids, including (1) the direct analysis of major yeast
glycerophospholipids and sphingolipids in complex lipid mixtures using
ESI-MS and tandem mass spectrometry (MS/MS) and (2) the analysis of sterols
and metabolites by GC-MS (Fig. 15.2). Lipid analysis of yeast using ESI-MS
is not a novel technique and dates back to the 1990s (Hechtberger et al.,
1994; Schneiter et al., 1999), but comprehensive characterization and
quantification of multiple classes of lipids in an "omic-centric" fashion has
only been reported in recent years (Ejsing et al., 2006, 2009; Guan and Wenk,
2006). Successful MS analysis often requires prior knowledge of the chemistry
and the fragmentation patterns of the lipids of interest. Numerous
studies have been reported, particularly for the analysis of
glycerophospholipids and sphingolipids in mammalian cells and tissues
(Brugger et al., 1997; Han and Gross, 2005; Sullards, 2000; Taguchi et al.,
2005), from which inferences can be drawn for lipid analysis in yeast.
Moreover, instruments often have accompanying software that allows automated
acquisition (termed data-dependent acquisition) (Schwudke et al., 2006).
Despite the wealth of information available and the ease of automation, we
nonetheless aim to provide a detailed description of the operation to allow
the reader to reproduce the techniques in probing yeast lipidomes. It should
be noted that the lipid inventory depends on culture conditions (Tuller et al.,
1999) and further differs between yeast species (Jungnickel et al., 2005), for
example, Schizosaccharomyces pombe versus S. cerevisiae (Shui et al., under
revision).

Methods have previously been published to analyze yeast sterols by GC
with or without MS (van den Hazel et al., 1999; Xu and Nes, 1988; Zinser
et al., 1991). These methods usually involve silylation of the isolated sterol
sample before GC, which improves separation but also has some
disadvantages. In this chapter, we present a simple protocol for the
semiquantitative analysis of free and total yeast sterols by GC-MS. The term
semiquantitative is used because appropriate standards are not commercially
available for the large variety of sterols present in wild-type (WT) and
mutant yeast cells, and therefore their exact quantities cannot be measured
by GC-MS.




2. Methods 

2.1. Sample preparation 

Cells (10 OD600 units) are harvested by addition of trichloroacetic acid
(TCA) to the medium to a final concentration of 5% (w/v) and are placed
on ice. The cooled cells are pelleted by centrifugation at 3000 × g for
5 min and the pellets are washed once with ice-cold 5% TCA, followed by
another wash with ice-cold deionized water. The cells are transferred to
screw cap tubes and can be snap-frozen in liquid nitrogen and stored at
−80 °C or used immediately for extraction. Except for Teflon, all plastic
should be avoided in subsequent procedures involving organic solvents
because plastics leach contaminants into the solution. Furthermore, organic
solvents should be handled in a fume hood.

Figure 15.2 General strategy for the extraction and analysis of major membrane
lipids from yeast. Glycerophospholipids, sphingolipids, and sterols are extracted
according to respective procedures. The former are analyzed by ESI-MS using
various scan modes including single stage MS, product ion scan (PIS), precursor
ion scan (PREIS), neutral loss (NL) scan, and multiple-reaction monitoring
(MRM) for profiling, species identification, and quantification. Sterols are
analyzed by GC-MS.
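The centrifugation steps in this chapter are specified as relative centrifugal
force (× g), whereas many benchtop centrifuges are set in rpm. The following
minimal sketch converts between the two using the standard relation
RCF = 1.118 × 10^-5 × r × rpm^2 (r in cm). The function names and the 10 cm
rotor radius in the example are hypothetical and are not taken from the
protocol; the actual radius should be read from the rotor documentation.

```python
import math

# Standard conversion constant for RCF = K * radius_cm * rpm**2
K = 1.118e-5


def rcf_from_rpm(rpm: float, rotor_radius_cm: float) -> float:
    """Relative centrifugal force (x g) produced at a given rotor speed."""
    return K * rotor_radius_cm * rpm ** 2


def rpm_for_rcf(rcf_g: float, rotor_radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach the requested force (x g)."""
    return math.sqrt(rcf_g / (K * rotor_radius_cm))


# Example: the 3000 x g harvesting spin on a hypothetical 10 cm rotor
speed = rpm_for_rcf(3000, rotor_radius_cm=10.0)
print(f"Set the centrifuge to about {speed:.0f} rpm")
```

The same helper can be reused for the 10,000 × g and 800 × g steps later in
the protocol by changing the first argument.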

2.2. Normalization for starting amount of material 

Two parameters can be used for normalizing the amount of lipids: 

(1) Dry weight: In this case, cells are freeze-dried and weighed before 
extraction. 

(2) Protein amount: An equivalent OD of cells used for lipid extraction is 
used for protein determination using standard assays. 



2.3. Lipid extraction (for glycerophospholipids and 
sphingolipids) 

Cells (10 OD600 units) are resuspended in 1 ml of extraction solvent
containing 95% ethanol, water, diethyl ether, pyridine, and 4.2 N ammonium
hydroxide in the ratio of 15:15:5:1:0.18 by volume. A cocktail of lipid
standards (Table 15.1) and 100 µl of glass beads are added. It should be
noted that lipid materials removed from storage should be allowed to reach
room temperature before they are opened for use. In addition, the standards
are stored as concentrated stock (mM to M) for stability and are diluted in
ethanol prior to use (Moore et al., 2007). We routinely sonicate the lipids
to ensure complete solubilization.
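When preparing a batch of extraction solvent, the 15:15:5:1:0.18 ratio has to
be converted into individual volumes. The sketch below does this for an
arbitrary batch size. It assumes that the component volumes are simply
additive (mixing contraction is ignored), and the 100 ml batch size and the
function name are hypothetical choices for illustration.

```python
# Parts by volume, as given in the protocol
RATIO = {
    "95% ethanol": 15,
    "water": 15,
    "diethyl ether": 5,
    "pyridine": 1,
    "4.2 N ammonium hydroxide": 0.18,
}


def component_volumes(total_ml: float) -> dict:
    """Split a target batch volume into component volumes (ml)."""
    total_parts = sum(RATIO.values())
    return {name: total_ml * parts / total_parts for name, parts in RATIO.items()}


# Example: a hypothetical 100 ml batch
for name, volume_ml in component_volumes(100.0).items():
    print(f"{name}: {volume_ml:.2f} ml")
```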

The sample is vortexed vigorously for several minutes and incubated at
60 °C for 20 min. Cell debris is pelleted by centrifugation at 10,000 × g for
10 min and the supernatant is collected. One milliliter of extraction solvent
is added to the pellet, which is vortexed again, and the extraction is
repeated. The supernatants are pooled and dried under a stream of
nitrogen or in a Centrivap (Labconco Corporation, Kansas City, MO). If a
Centrivap is used, the temperature should be raised gradually to
60 °C to avoid boiling.

Table 15.1 List of synthetic standards for quantitative analysis of major yeast lipids

Standard                       Source    Ionization mode   MRM transition (Q1/Q3)
Didodecanoyl GPEtn             Avanti    Negative          578.6/196.1
Didodecanoyl GPSer             Avanti    Negative          622.6/535.5
Dioctanoyl GPIns               Echelon   Negative          585.5/241.1
d18:1/19:0 Ceramide            Matreya   Positive          580.7/264.2
Dodecanoyl, tridecanoyl GPA    Avanti    Negative          549.5/153.1
Didecanoyl GPGro               Avanti    Negative          553.6/153.1
Dinonadecanoyl GPCho           Avanti    Positive          818.7/184.1
Cholesterol                    Avanti    Negative          386.3/368.3
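Because each sample is spiked with known amounts of the standards in
Table 15.1, the amount of an endogenous species can be estimated from the
ratio of its ion intensity to that of the class-matched standard. The sketch
below illustrates this ratio calculation, with optional normalization to dry
weight or protein amount (Section 2.2). It assumes an equal instrument
response for analyte and standard, which is generally only approximately
true, so the result should be regarded as semiquantitative. The function name,
the intensities, and the spiked amount are hypothetical.

```python
def estimate_amount(analyte_intensity: float,
                    standard_intensity: float,
                    standard_pmol: float,
                    normalizer: float = 1.0) -> float:
    """Estimate analyte amount (pmol per unit of normalizer).

    normalizer is the dry weight (mg) or protein amount of the sample;
    leave it at 1.0 to report the amount per sample.
    """
    if standard_intensity <= 0 or normalizer <= 0:
        raise ValueError("standard intensity and normalizer must be positive")
    return analyte_intensity / standard_intensity * standard_pmol / normalizer


# Hypothetical example: analyte signal is twice that of a 50 pmol standard,
# and the sample corresponds to 2 mg dry weight
amount = estimate_amount(4.0e5, 2.0e5, standard_pmol=50.0, normalizer=2.0)
print(f"{amount:.1f} pmol per mg dry weight")
```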

For complex lipids including glycerophospholipids, sphingolipids, and
neutral lipids, the extract is desalted using butanol and water. The lipid film
is resuspended in 300 µl of water-saturated butanol and 150 µl of water is
added. The mixture is vortexed vigorously, followed by centrifugation at
10,000 × g for 2 min. The top butanol phase is recovered and the aqueous
phase is reextracted with 300 µl of water-saturated butanol. The
butanol phases are pooled and dried under a stream of nitrogen or
with a Centrivap (50-60 °C). The lipid film is resuspended in 400 µl of
chloroform-methanol (1:1, v/v) for MS analysis.

2.3.1. Alkaline methanolysis (optional) 

For analysis of ceramides and complex sphingolipids, an optional step of
mild alkaline hydrolysis is recommended to reduce ion suppression by other
lipids such as glycerophospholipids during MS analysis. This treatment
hydrolyzes acyl bonds (and hence the majority of glycerophospholipids)
but leaves amide linkages largely intact (Brockerhoff, 1963). Two different
methods are available. In the first method, the lipid film from the pyridine
extraction is reconstituted in 400 µl of a mixture of chloroform, methanol,
and water in the ratio of 16:16:5 by volume. Four hundred microliters of
0.2 N methanolic NaOH is added and the mixture is incubated for 1 h at
37 °C on a thermoshaker with mild shaking. Eighty microliters of 1 N
acetic acid is then added to neutralize the mixture, followed by the addition
of 400 µl of 0.5 M EDTA and 400 µl of chloroform. The sample is vortexed
vigorously for 1 min, followed by centrifugation at 10,000 × g for 2 min.
The lower organic phase is recovered and the aqueous phase is reextracted
with 600 µl of chloroform. The organic phases are pooled and dried
under a stream of nitrogen or in the Centrivap (Guan and Wenk, 2006). In
the second method, the lipid film is reconstituted in 400 µl of
monomethylamine reagent (methanol:H2O:n-butanol:methylamine = 4:3:1:5
by volume) and incubated for 1 h at 53 °C (Cheng et al., 2001). The mixture is
then dried under a stream of nitrogen or in a Centrivap (50-60 °C). The
lipid extract is then desalted using butanol and water as described above.
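As a quick check on the first method above, the acetic acid added should match
the NaOH in equivalents. The short sketch below does this arithmetic for the
volumes and normalities stated in the protocol; the helper name is
hypothetical.

```python
import math


def microequivalents(volume_ul: float, normality: float) -> float:
    """Volume in microliters times normality (eq/l) gives microequivalents."""
    return volume_ul * normality


base = microequivalents(400, 0.2)  # 400 µl of 0.2 N methanolic NaOH
acid = microequivalents(80, 1.0)   # 80 µl of 1 N acetic acid

print(f"NaOH: {base:.0f} µeq, acetic acid: {acid:.0f} µeq")
print("Neutralized:", math.isclose(base, acid))
```

Both quantities come to 80 µeq, which is why 80 µl of 1 N acetic acid is
sufficient to neutralize the mixture. If the NaOH volume is scaled, the acid
volume should be scaled by the same factor.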

2.4. Sterol isolation 

In order to quantify free, nonesterified sterols, the procedure used for the
isolation of glycerophospholipids and sphingolipids cannot be used because
it results in partial hydrolysis of sterol esters. Therefore, cells are extracted
with a chloroform/methanol procedure. Cells (10 OD600 units) are
harvested as above. Cholesterol (2 nmol) and 100 µl of glass beads are added
to the cell pellet, which is resuspended in 50 µl of water, and 50 µl of
methanol is added. After vortexing, 100 µl of chloroform is added and the
sample is vortexed for 6 min, followed by centrifugation at 100 × g for 5 min.
The supernatant is transferred to a new tube and the pellet is vortexed again
with 100 µl of chloroform/methanol (2:1). After centrifugation, the
supernatants are combined. Thirty-four microliters of 0.034% MgCl2 is added,
and the sample is vortexed and centrifuged at 800 × g for 5 min. The aqueous
phase is removed and 34 µl of 2 M KCl/methanol (4:1) is added to the lower
phase. After vortexing and centrifugation, the aqueous phase is removed. This
procedure is repeated two times with 34 µl of artificial upper phase
(chloroform:methanol:water, 3:48:47). The organic phase is then removed
without taking the protein interface and dried. Sterol esters are present in
this preparation but are difficult to quantify by GC-MS.

To determine the amounts of total cellular sterols (esterified and
nonesterified), the following protocol is used. Two nanomoles of cholesterol
are added to the cell pellet as internal standard. Cells are resuspended in
300 µl of 60% KOH, to which are added 300 µl of methanol containing 0.5%
pyrogallol and 450 µl of methanol, in a screw cap glass tube. The tube is
placed at 85 °C and, once hot, the cap is tightly closed. After 2 h of
incubation with occasional mixing, the sample is returned to room temperature
and the sterols are extracted three times with 1 ml of petroleum ether. The
combined petroleum ether phases are dried and analyzed by GC-MS.
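Since the chloroform/methanol extract reports free sterols and the alkaline
hydrolysis reports total sterols, the esterified pool of a given sterol can be
estimated by difference. The sketch below expresses each GC-MS peak relative to
the 2 nmol cholesterol internal standard and then subtracts free from total.
The peak areas are hypothetical, and an equal detector response for cholesterol
and the yeast sterol is an assumption, which is why the values are only
semiquantitative (cholesterol equivalents).

```python
CHOLESTEROL_NMOL = 2.0  # internal standard added to each cell pellet


def sterol_nmol(peak_area: float, cholesterol_area: float,
                standard_nmol: float = CHOLESTEROL_NMOL) -> float:
    """Sterol amount in cholesterol equivalents (nmol)."""
    if cholesterol_area <= 0:
        raise ValueError("cholesterol peak area must be positive")
    return peak_area / cholesterol_area * standard_nmol


def esterified_nmol(free_nmol: float, total_nmol: float) -> float:
    """Esterified pool estimated as total minus free, clamped at zero."""
    return max(total_nmol - free_nmol, 0.0)


# Hypothetical ergosterol peak areas from the two extractions
free = sterol_nmol(peak_area=3.0e6, cholesterol_area=1.0e6)
total = sterol_nmol(peak_area=4.0e6, cholesterol_area=1.0e6)
print(f"free {free:.1f} nmol, total {total:.1f} nmol, "
      f"esterified {esterified_nmol(free, total):.1f} nmol")
```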

2.4.1. Solid phase extraction for separation of sterols 
from other lipids (optional) 

A column of Chromabond SiOH (Macherey-Nagel, Germany) (0.1 g) is
washed twice with 1 ml of chloroform. Total lipid extract from 50 OD600
units of cells is resuspended in 0.25 ml of chloroform by vortexing and
sonication. The extract is then applied to the column and eluted twice with
0.65 ml of chloroform. The flow-through and chloroform eluates are
combined and dried (sterol extract). The column is then eluted three
times with 0.5 ml of methanol. The combined methanol eluates are dried and
can be used as a total lipid extract as above.

2.5. Lipid analysis 

2.5.1. Direct analysis of glycerophospholipids and sphingolipids 
in complex lipid mixtures by ESI-MS and ESI-MS/MS 

Glycerophospholipids, ceramides, and inositol sphingolipids are analyzed by
direct infusion using ESI-MS and MS/MS (Fig. 15.3). The samples from
the butanolic extraction or after alkaline hydrolysis are reconstituted in
400 µl of chloroform-methanol (1:1, v/v). Samples are centrifuged at
maximum speed for 10 min to remove any residues. A high-performance
liquid chromatography system (Agilent Technologies, Santa Clara, CA) is
coupled to a 4000 QTRAP triple quadrupole mass spectrometer (Applied
Biosystems, Foster City, CA) with an ESI source. The mass spectrometer
is operated in both the positive and negative ion modes, depending on the
propensity of the class of lipids of interest to ionize (Table 15.2). The spray
voltage is 5.5 kV in the positive ion mode and 4.5 kV in the negative ion mode.
The MS is operated with a curtain gas of 20 psi, a source temperature of
250 °C, and both ion source gas 1 and 2 set to 30 psi. For all MS/MS
experiments, each individual ion dissociation pathway has to be optimized
with regard to collision energy and declustering potential to minimize
variations in relative ion abundance due to differences in rates of
dissociation (Table 15.3 and Fig. 15.5). Samples are introduced into the mass
spectrometer by loop injection with chloroform-methanol (1:1, v/v) as the
mobile phase at a flow rate of 200 µl/min. Typically, 25 µl of sample is
injected for analysis.

Figure 15.3 Analysis of yeast lipids by electrospray ionization-mass spectrometry
(ESI-MS) and tandem mass spectrometry (MS/MS). (A) Single stage ESI-MS in the
negative ion mode. The majority of glycerophospholipids and sphingolipids are
detected in the mass range of 400-1200. The ions can be tentatively assigned by their
mass-to-charge (m/z) ratio. Characterization of ions can be achieved by
collision-induced dissociation (CID) and MS/MS. (B) MS/MS spectra of ions with
m/z 835. (C) Precursor ion scans for lipids containing inositol phosphate head
group (m/z 241). Samples can be spiked with internal standards, which are
typically not found naturally in the samples under investigation, to allow for
semiquantitative profiling. (D) Overlay of chromatogram (left panel) and standard
curve (right panel) obtained from quantification of varying concentrations of a
commercially available 34:2 GPIns by multiple-reaction monitoring (MRM).
Selective quantification can be attained with a reasonably good linearity. Note
that 34:2 GPIns is a minor ion in the complex lipid mixture and MRM offers a
selective and sensitive method for quantification.

2.5.1.1. Single stage (nontargeted) profiling

The mass spectrometer is operated in enhanced MS mode to obtain the profiles
of the total lipid extract. Measurement of lipids by ESI-MS is based on the
ability of each class of lipids to acquire positive or negative charges when in
solution during ionization, and the structure of each lipid determines its
inherent ionization properties. Table 15.2 summarizes the charge states of
major glycerophospholipids and sphingolipids. An example of a single stage MS
profile, in the negative ion mode, of an extract obtained from S. cerevisiae
grown in rich medium (yeast-peptone-dextrose) to logarithmic phase is shown in
Fig. 15.3A. Typically, the scan range is between m/z 400 and 1200, which
includes the detection of various classes of lipids including
glycerophosphatidylethanolamine (GPEtn), glycerophosphatidylserine (GPSer),
glycerophosphatidic acid (GPA), glycerophosphatidylglycerol (GPGro),
glycerophosphatidylinositol (GPIns), ceramide, and inositol sphingolipids.

The profiling of complex lipid mixtures using single stage MS may serve
as an initial screen when different conditions or strains are to be compared.
The comparison of profiles can be achieved by multivariate analysis or
simply by generating a differential profile that displays differences in ion
response between the two conditions of interest. Software for the analysis,
alignment, and comparison of full-scan MS data is increasingly available (De
Vos et al., 2007; Song et al., 2009; Wong et al., 2005). Such a nontargeted
survey may serve as a starting point to reveal perturbations in the lipid
Table 15.2 Precursor ions, MS/MS scan modes, and associated parameters for analysis of major glycerophospholipids and sphingolipids in S. cerevisiae

DP, declustering potential; CE, collision energy. Cells left empty carry no value in the source table.

Lipid           Precursor ion   MS/MS mode   Fragment                                             Mass range (m/z)   DP (V)   CE (V)

Glycerophospholipid
GPA             [M − H]−        PREIS 153    Glycerophosphate derivative                          370-800            −75      −40 to −60
GPGro           [M − H]−        PREIS 153    Glycerophosphate derivative                          400-800            −75      −40 to −60
                [M + NH4]+      NL 189       Phosphoglycerol
GPCho           [M + H]+        PREIS 184    Phosphocholine                                       400-900            70       45-65
GPEtn           [M − H]−        PREIS 196    Glycerophosphoethanolamine derivative                400-800            −75      −40 to −65
                [M + H]+        NL 141       Phosphoethanolamine
GPSer           [M − H]−        NL 87        Serine                                               400-800            −75      −25
                [M − H]−        PREIS 153    Glycerophosphate derivative                          400-800            −75      −40 to −60
                [M + H]+        NL 185       Phosphoserine
GPIns           [M − H]−        PREIS 241    Cyclic inositol phosphate                            450-900            −90      −45 to −60
                [M − H]−        PREIS 153    Glycerophosphate derivative                          450-900            −75      −45 to −65
                [M + NH4]+      NL 277       Phosphoinositol

Sphingolipid
Phytoceramide   [M + H]+        PREIS 266    Double dehydration product of d18:0 sphingoid base   600-800            80       40-50
                [M + H]+        PREIS 282    Double dehydration product of t18:0 sphingoid base                      80       40-50
                [M + H]+        PREIS 294    Double dehydration product of d20:0 sphingoid base                      100      40-50
                [M + H]+        PREIS 310    Double dehydration product of t20:0 sphingoid base                      100      40-50
IPC             [M − H]−        PREIS 241    Cyclic inositol phosphate                            800-1000           −120     −60 to −70
                [M + H]+        PREIS 266    Double dehydration product of d18:0 sphingoid base                      100      60-70
                [M + H]+        PREIS 282    Double dehydration product of t18:0 sphingoid base                      100      60-70
                [M + H]+        PREIS 294    Double dehydration product of d20:0 sphingoid base                      100      60-70
                [M + H]+        PREIS 310    Double dehydration product of t20:0 sphingoid base                      100      60-70
MIPC            [M − H]−        PREIS 421    Cyclic inositol phosphate                            950-1200           −160     −60 to −70
                [M + H]+        PREIS 266    Double dehydration product of d18:0 sphingoid base                      100      75-80
                [M + H]+        PREIS 282    Double dehydration product of t18:0 sphingoid base                      100      75-80
                [M + H]+        PREIS 294    Double dehydration product of d20:0 sphingoid base                      100      75-80
                [M + H]+        PREIS 310    Double dehydration product of t20:0 sphingoid base                      100      75-80
M(IP)2C         [M − 2H]2−      PREIS 241    Cyclic inositol phosphate                            600-750            −180     −65 to −75
Table 15.3 Precursor/product ion m/z values for multiple-reaction monitoring of major species of S. cerevisiae glycerophospholipids

Molecular      GPA            GPGro          GPEtn          GPSer          GPIns          GPCho
species        [M − H]−       [M − H]−       [M + H]+       [M − H]−       [M − H]−       [M + H]+
               Q1     Q3      Q1     Q3      Q1     Q3      Q1     Q3      Q1     Q3      Q1     Q3
14:0 (Lyso)    379.4  153.1   453.4  153.1   426.4  285.5   466.4  379.4   543.4  241.1   468.4  184.1
16:1 (Lyso)    407.4  153.1   481.4  153.1   452.4  311.6   494.4  407.4   569.4  241.1   494.4  184.1
16:0 (Lyso)    409.4  153.1   483.4  153.1   454.4  313.5   496.4  409.4   571.4  241.1   496.4  184.1
18:1 (Lyso)    435.4  153.1   509.4  153.1   480.4  339.6   522.4  435.4   597.4  241.1   522.4  184.1
18:0 (Lyso)    437.4  153.1   511.4  153.1   482.4  341.5   524.4  437.4   599.4  241.1   524.4  184.1
26:1           561.6  153.1   635.6  153.1   606.7  465.7   648.7  561.7   723.7  241.1   648.6  184.1
26:0           563.6  153.1   637.6  153.1   608.7  467.7   650.7  563.7   725.7  241.1   650.6  184.1
28:1           589.6  153.1   663.7  153.1   634.7  493.7   676.7  589.7   751.7  241.1   676.7  184.1
28:0           591.6  153.1   665.7  153.1   636.7  495.7   678.7  591.7   753.7  241.1   678.7  184.1
30:1           617.6  153.1   691.7  153.1   662.7  521.7   704.7  617.7   779.7  241.1   704.7  184.1
30:0           619.6  153.1   693.7  153.1   664.7  523.7   706.7  619.7   781.7  241.1   706.7  184.1
32:2           643.7  153.1   717.7  153.1   688.7  547.7   730.7  643.7   805.7  241.1   730.7  184.1
32:1           645.7  153.1   719.7  153.1   690.7  549.7   732.7  645.7   807.7  241.1   732.7  184.1
32:0           647.7  153.1   721.7  153.1   692.7  551.7   734.7  647.7   809.7  241.1   734.7  184.1
34:2           671.7  153.1   745.7  153.1   716.7  575.7   758.7  671.7   833.7  241.1   758.7  184.1
34:1           673.7  153.1   747.7  153.1   718.7  577.7   760.7  673.7   835.7  241.1   760.7  184.1
34:0           675.7  153.1   749.7  153.1   720.7  579.7   762.7  675.7   837.7  241.1   762.7  184.1
36:2           699.7  153.1   773.7  153.1   744.7  603.7   786.7  699.7   861.7  241.1   786.7  184.1
36:1           701.7  153.1   775.7  153.1   746.7  605.7   788.7  701.7   863.7  241.1   788.7  184.1
36:0           703.7  153.1   777.7  153.1   748.7  607.7   790.7  703.7   865.7  241.1   790.7  184.1

profiles which cannot be easily anticipated (Guan et al., 2006). For instance,
Fig. 15.4 represents profiles obtained from a total lipid and sphingolipid
extract obtained from WT cells and cells deficient in the acyltransferase,
Slc1p. Note in panel A that the signal of a prominent
inositol-phosphorylceramide (IPC, m/z 952, asterisk) is substantially
increased after alkaline hydrolysis.
Figure 15.4 Glycerophospholipid and sphingolipid profiling of yeast mutants. (A)
Typical phospholipid (left panels) and sphingolipid profiles (right panels) of wild-type
(WT) yeast and (B) deletion mutants of the slc1 gene. Mass spectra from at least four
independent samples were averaged for each condition (n = 4). (C) Differential lipid
profiles, which are ratios of single-stage mass spectrometry scans plotted as log10 ratios,
are used to compare differences in glycerophospholipid and sphingolipid composition
between Δslc1 and WT. Note that this approach does not require knowledge of the
underlying lipid species for a given ion of interest. Instead, it serves as an "unbiased"
screening tool for discovery of lipids which are present in different amounts between
two conditions. Reproduced and modified from Guan and Wenk (2006).
The spectra are aligned and the differences in ion intensity between
the two conditions (Δslc1 vs. WT) are computed to generate the differential
profiles shown in Fig. 15.4C. The major differences lie in ions at m/z 725 and
753, which, based on the mass, can tentatively be assigned to GPIns with a total
fatty acyl carbon number of 26 and no double bonds (26:0-GPIns) and to
28:0-GPIns. The precise identity of ions can be confirmed by MS/MS.
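
The differential profile described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' software: the function names, the noise floor, and the twofold threshold are assumptions, and the intensities are invented for the example.

```python
import math

def differential_profile(wt, mutant, floor=1.0):
    """Return {m/z: log10(mutant / WT)} for ions present in either averaged spectrum.

    wt and mutant map a nominal m/z to an averaged ion intensity; `floor` is an
    assumed noise floor that avoids division by zero for undetected ions.
    """
    profile = {}
    for mz in sorted(set(wt) | set(mutant)):
        ratio = max(mutant.get(mz, 0.0), floor) / max(wt.get(mz, 0.0), floor)
        profile[mz] = math.log10(ratio)
    return profile

def flag_ions(profile, threshold=0.3):
    """List m/z values whose |log10 ratio| exceeds the threshold (0.3 is about twofold)."""
    return [mz for mz, value in profile.items() if abs(value) > threshold]

# Made-up intensities for illustration only.
wt_spectrum = {725: 1000.0, 753: 800.0, 835: 5000.0}
mutant_spectrum = {725: 100.0, 753: 80.0, 835: 5200.0}
profile = differential_profile(wt_spectrum, mutant_spectrum)
changed = flag_ions(profile)
```

Because the comparison uses only m/z and intensity, it needs no prior assignment of the ions, which is what makes the screen "unbiased" in the sense of Fig. 15.4.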

A nontargeted profiling approach offers the advantage of discovering a 
previously uncharacterized lipid moiety or an unexpected lipid mediator 
under a given condition. However, a relevant lipid may still be missed,
because the dynamic range and chemical complexity of lipids in a mixture
exceed the capacity of existing instrumentation and not all species are
detected. For instance, sterols are not detected under these conditions and
require an alternative method (see Section 2.5.2).

2.5.1.2. Characterization by tandem mass spectrometry

To characterize an ion of interest, the mass spectrometer is set to product
ion scan (PIS) mode, in which the first quadrupole is set to transmit a
selected precursor ion, for instance, m/z 835 in negative ion mode. These ions
enter the second quadrupole, often referred to as the collision cell, where
they are fragmented by collision-induced dissociation (CID), and the fragments
are resolved by the third quadrupole, which scans over a mass range, typically
from m/z 70 to the m/z of the selected precursor ion. In this instance, CID of
m/z 835 with a collision energy of 55 V and a declustering potential of -100 V
yields fragments with m/z 153, 241, 255, and 281, which are dehydrated
glycerophosphate ions, inositol phosphate ions, and the carboxylate ions of
palmitic acid (16:0) and oleic acid (18:1), respectively (Fig. 15.3B). This
identifies the ion as a 34:1 GPIns. Public databases to facilitate
interpretation of MS/MS data for a range of lipids are available, including
LipidSearch (Taguchi et al., 2007), LipidMaps MS tool (Fahy et al., 2007), and
LipidQA (Song et al., 2007).
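
As a rough aid to reading such a product ion spectrum, the m/z of fatty acyl carboxylate anions can be computed from elemental composition and compared with the observed fragments. The sketch below assumes straight-chain fatty acids, monoisotopic masses, and a 0.5 m/z matching tolerance; the function names are illustrative and it is not a substitute for the databases named above.

```python
# Monoisotopic masses (u) of the relevant atoms and of the electron.
C, H, O, ELECTRON = 12.0, 1.007825, 15.994915, 0.000549

def carboxylate_mz(carbons, double_bonds):
    """m/z of the [M - H]- carboxylate anion of a straight-chain fatty acid."""
    hydrogens = 2 * carbons - 1 - 2 * double_bonds  # free acid has 2n - 2d H
    return carbons * C + hydrogens * H + 2 * O + ELECTRON

def assign_fatty_acyls(fragments, tolerance=0.5):
    """Match observed fragment m/z values against common fatty acyl anions."""
    candidates = {(c, d): carboxylate_mz(c, d)
                  for c in range(14, 27) for d in (0, 1)}
    hits = {}
    for mz in fragments:
        hits[mz] = [f"{c}:{d}" for (c, d), ref in candidates.items()
                    if abs(ref - mz) <= tolerance]
    return hits

# Fragments reported in the text for CID of m/z 835.
hits = assign_fatty_acyls([153, 241, 255, 281])
```

With these inputs, 255 and 281 match 16:0 and 18:1, 153 matches no fatty acyl, and 241 also matches 15:0 at nominal mass, so a head-group fragment and an odd-chain fatty acyl cannot be told apart by unit-resolution m/z alone.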

2.5.1.3. Targeted profiling and semiquantitative analysis of lipids by class using precursor ion and neutral loss scans

For selective detection and semiquantification of specific classes of lipids,
the mass spectrometer is operated in precursor ion (PREIS) or neutral loss
(NL) scan mode. Table 15.2 summarizes the PREIS and NL scan modes for major
glycerophospholipids and sphingolipids in S. cerevisiae. In the PREIS scan
mode, the first quadrupole scans over a selected mass range and the ions
sequentially enter the collision cell, where collision energy is applied to
induce fragmentation. The third quadrupole is set to transmit a selected
single product ion, which is typically the diagnostic fragment of a specific
class of lipid. For instance, a PREIS of m/z 241 in negative mode detects ions
with a phosphoinositol group: the resulting mass spectrum contains all
precursor ions that decompose to produce the fragment with m/z 241. In yeast,
this generates a profile of GPIns and IPC, as well as doubly charged ions of
M(IP)2C (Fig. 15.3C). It should be noted, however, that the 15:0 fatty acyl
anion shares the same nominal m/z as the dehydrated phosphoinositol fragment,
so such data should be handled with care. Because odd-chain fatty acids are
uncommon in S. cerevisiae cultured in rich media, the fragment of m/z 241 is
nevertheless considered specific for phosphoinositol-containing lipids.

An NL scan is used to profile several classes of glycerophospholipids in
which the charge does not localize to the lipid head group after
fragmentation. In this mode, the first and third quadrupoles are linked
and scanned at the same speed over the same mass range with a constant
mass difference between the two analyzers, for instance 87 amu in negative
mode for the loss of serine. Because of the mass offset, at any time the third
quadrupole transmits product ions with a fixed lower m/z value than the
mass-selected precursor ions transmitted by the first quadrupole. The
resultant mass spectrum contains all the precursor ions that lose a neutral
species of the selected mass; in the case of an NL of 87 amu, a
glycerophosphatidylserine profile is generated.
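
The linked scanning of the two quadrupoles amounts to a fixed subtraction, which the following sketch makes explicit. The function names and the 1 m/z step are assumptions for illustration; the worked value uses the 34:1 GPSer precursor listed in Table 15.3.

```python
def neutral_loss_q3(q1_mz, neutral_loss):
    """Q3 setting that tracks Q1 at a constant offset during an NL scan."""
    return round(q1_mz - neutral_loss, 1)

def neutral_loss_pairs(q1_start, q1_stop, neutral_loss, step=1.0):
    """Linked (Q1, Q3) settings across the scanned mass range."""
    pairs = []
    q1 = q1_start
    while q1 <= q1_stop:
        pairs.append((round(q1, 1), neutral_loss_q3(q1, neutral_loss)))
        q1 += step
    return pairs

# 34:1 GPSer precursor from Table 15.3, NL of 87 in negative mode.
q3_for_34_1 = neutral_loss_q3(760.7, 87.0)
pairs = neutral_loss_pairs(700.0, 703.0, 87.0)
```

The computed Q3 value of 673.7 agrees with the GPSer Q3 entry for the 34:1 species in Table 15.3, which is the same 87 amu offset applied to a single transition.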

2.5.1.4. Quantitative analysis by multiple-reaction monitoring

Multiple-reaction monitoring (MRM) is a highly selective and sensitive method
for quantification of specific lipids of interest. Quantitative analysis
requires the use of internal standards, which are typically
stable-isotope-labeled lipids or unnatural lipids that are usually chemically
synthesized (Table 15.1). In MRM experiments, the first quadrupole is set to
pass the precursor ion of interest to the collision cell, where it undergoes
CID. The third quadrupole, Q3, is set to pass the structure-specific product
ion characteristic of the lipid of interest. Again, these transitions require
user-defined input of the parent ion (Q1) and product ion (Q3). Figure 15.5
and Table 15.3 summarize the MRM transitions for several major yeast
sphingolipids and glycerophospholipids used for quantification of these lipids
in our previous work (Guan et al., 2009; Guan and Wenk, 2006). A chromatogram
is generated (Fig. 15.3D) and the area under each curve is integrated for each
lipid measured. Lipid concentrations are calculated relative to the relevant
internal standards, which are spiked in according to the amount of starting
material, and are expressed as moles per mg dry weight or moles per mg
protein. Absolute quantification of inositol-containing lipids was previously
limited by the availability of pertinent standards; this limitation has been
addressed by the production of nonnatural lipids through genetic manipulation
of yeast strains (Ejsing et al., 2009).
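
The calculation relative to an internal standard can be written out as follows. This is a sketch under a stated assumption, namely that the analyte and its standard respond equally in the chosen transition; the function name, units, and peak areas are hypothetical.

```python
def lipid_amount_per_mg(analyte_area, standard_area, standard_pmol, sample_mg):
    """Amount of analyte (pmol per mg) from the analyte/internal-standard area ratio.

    Assumes equal MRM response for the analyte and its internal standard.
    """
    if standard_area <= 0 or sample_mg <= 0:
        raise ValueError("standard area and sample mass must be positive")
    return analyte_area / standard_area * standard_pmol / sample_mg

# Hypothetical integrated peak areas for one MRM transition and its standard.
amount = lipid_amount_per_mg(analyte_area=2.4e6, standard_area=1.2e6,
                             standard_pmol=50.0, sample_mg=5.0)
```

Here an area ratio of 2 with 50 pmol of standard in 5 mg of material gives 20 pmol/mg; where response factors differ between analyte and standard, a correction factor would have to be determined separately.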

Figure 15.5 Summary of the sphingolipid pathway of S. cerevisiae and precursor/product ion m/z values and associated parameters for MRM
detection of individual molecular species of sphingoid bases and complex ceramides. Reproduced and modified from Guan and Wenk
(2006). Abbreviations: DP, declustering potential; CE, collision energy.

2.5.2. Sterol analysis by GC-MS

Extracts are analyzed by GC-MS as follows. Samples are injected into a
VARIAN CP-3800 gas chromatograph equipped with a Factor Four Capillary
Column VF-5ms (15 m × 0.32 mm i.d., DF = 0.10) and analyzed by a
Varian 320 MS triple quadrupole with electron energy set to -70 eV at
250 °C. Samples are applied with the column oven at 45 °C, held for 4 min,
then raised to 195 °C (20 °C/min). Sterols are eluted with a linear gradient
from 195 to 230 °C (4 °C/min), followed by raising to 320 °C (10 °C/min).
Finally, the column temperature is raised to 350 °C (6 °C/min) to elute
sterol esters. This protocol allows sufficient separation of sterols as judged by
elution of standards and extracts from ergosterol biosynthesis mutants of
known sterol composition. An example of a chromatogram of a sample of
WT yeast is shown (Fig. 15.6) and the profiles of a crude extract and one
purified over SPE are compared.

Figure 15.6 Use of SPE separation for GC-MS analysis of sterols. Total lipid extracts
(top three panels) or the chloroform eluate from an SPE column of the same extract
were analyzed by GC-MS as described. The total ion intensity or the intensities of the
m/z 396 (ergosterol) or 386 (cholesterol standard) ions are shown. Separation by SPE
eliminates the peaks appearing at around 25 min, which are products of
glycerophospholipids and other substances, and slightly improves the resolution in the
region where sterols are found (15-20 min).

Figure 15.7 GC-MS analysis of wild-type (WT) and erg2 mutant strains. The portion
of the GC-MS profile of the WT and erg2 mutant strains where sterols elute is shown.
The major ion at 396 is ergosta-5,7,22-trienol in WT (A), as seen by its fragmentation
pattern (panel A) compared to the pattern found in the NIST library. Exiting the GC
column 0.5 min earlier is the major ion at 396 in the erg2 mutant, ergosta-5,8,22-trienol (B),
whose fragmentation pattern is very similar, but not identical, to that of ergosterol
(panel B) and is also similar to the profile from the NIST library. Fragmentation occurs
in the source.

Standard curves can be constructed by
extracting data for the relevant ions for known amounts of cholesterol (386)
and ergosterol (396). Compounds are identified by their retention times
(compared to standards) and fragmentation patterns, which are compared
to the NIST library or previously characterized sterols from WT and mutant
yeast cells. Most stereoisomers can be separated by this method. In general,
stereoisomers with a double bond at position 8 elute before the corresponding
isomer with a double bond at position 7 (Fig. 15.7). It is important to pay
attention to the ion used for quantification because the major ions may not be
of similar intensities for different sterols. For example, the 396 ion for ergosterol
is considerably more intense than the same ion for ergosta-5,8,22-trienol. To
compare these sterols it is best to use a unique, but more representative ion (see
Fig. 15.7, e.g., 363) or total ion counts, if sterols are sufficiently well separated.
We have recently published data on the major sterols found in five isogenic
ergosterol biosynthesis mutants (Guan et al., 2009) and these strains can be used
as sources to determine the retention times of yeast sterols that are not
commercially available.
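
The oven program given in this section can be turned into a timetable to see when each temperature segment ends. The sketch below assumes ideal linear ramps with no final hold or equilibration delay, which the text does not specify; the function name is illustrative.

```python
def oven_program_minutes(start_temp, hold_min, ramps):
    """Return (segment end times in min, total run time) for a GC oven program.

    `ramps` is a list of (target temperature in deg C, rate in deg C per min).
    """
    elapsed = hold_min
    temp = start_temp
    ends = []
    for target, rate in ramps:
        elapsed += (target - temp) / rate
        temp = target
        ends.append(elapsed)
    return ends, elapsed

# 45 deg C for 4 min, then the four ramps described in Section 2.5.2.
ramps = [(195, 20), (230, 4), (320, 10), (350, 6)]
segment_ends, total = oven_program_minutes(45, 4, ramps)
```

Under these assumptions the 195 to 230 °C sterol gradient runs from 11.5 to 20.25 min, which brackets the 15-20 min sterol region noted for Fig. 15.6, and the full program takes 34.25 min.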
Abstract

Driven by the advent of metabolomics, recent years have seen renewed interest 
in the investigation of yeast metabolism. Here we provide a practical guide to 
metabolomic analysis of yeast using liquid chromatography-mass spectrometry 
(LC-MS). We begin with background on LC-MS and its utility in studying yeast 
metabolism. We then describe key issues involved at each step of a typical 
yeast metabolomics experiment: in experimental design, cell culture, metabo- 
lite extraction, LC-MS, and data processing and analysis. Throughout, we 
highlight interdependencies between the steps that are relevant to developing 
an integrated workflow which effectively leverages LC-MS to reveal yeast 
biology. 




1. Introduction 



Liquid chromatography-mass spectrometry (LC-MS) has emerged as
a preferred tool for measuring the small molecule components of cellular
metabolism. LC-MS enables simultaneous analysis of dozens to hundreds of
chemical species. It has already facilitated significant discoveries in yeast, 
including identification of metabolites that oscillate in phase with a meta- 
bolic cycle (Tu et al., 2007), or that respond to shifting glucose or ammonia 
availability (Brauer et al., 2006; Wu et al., 2006a). Future applications hold 
promise for decoding regulation of yeast metabolism, as well as central 
mysteries in yeast physiology, such as the molecular control of growth 
rate and cell size. 

To achieve these benefits, it is important for metabolomic capabilities to 
be widely available to the yeast community. Here we aim to provide a 
practical resource for initiating LC-MS-based metabolomic studies of yeast.
The focus is on quantitating the concentrations of metabolites, although
many of the concepts also apply to probing metabolic flow ("flux").
We begin with a brief description of LC-MS fundamentals. The order of
the subsequent sections then follows roughly the steps of a basic
metabolomics experiment: experimental design, cell culture, metabolite extraction,
LC-MS, and data processing and analysis.
2. LC-MS Basics

LC-MS involves three fundamental steps: LC separation, ionization,
and separation and quantitation of the ions by MS. The main benefits of
LC-MS as an analytical tool are its specificity and sensitivity. Specificity
arises from two orthogonal dimensions of separation: chromatographic 
retention time (RT) and mass-to-charge ratio (m/z). In modern instru- 
ments, specificity is enhanced by high mass resolution and/or multiple 
rounds of MS ("tandem mass spectrometry" or "MS/MS"). 

The main downside of LC-MS is imperfect quantitative capabilities.
On the positive side, for a given analyte and LC-MS method, signal
intensity and concentration are often linear over a broad dynamic range
(>100-fold). On the downside, measurements are intrinsically noisy, with
typical relative standard deviations (i.e., the standard deviation of signal
intensity across repeated runs of the same sample, divided by the average
signal intensity) in the range of 10-20%. Moreover, the relationship
between MS signal and concentration depends strongly on the analyte
structure. Differences in response factor (i.e., the ratio of signal intensity
to analyte concentration) of >100-fold are common across analytes,
meaning that equimolar solutions of two different compounds can vary in
measured signal by orders of magnitude. Response factors also vary across 
instruments, and, for a given instrument and analyte, tend to drift over time. 
Relative ion intensities across samples can therefore be used only to approxi- 
mate relative concentrations, and do so reliably only when the samples are 
analyzed in close succession. 
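
The relative standard deviation defined above is straightforward to compute from repeated injections. The following sketch uses Python's standard library; the intensities are hypothetical values chosen to fall in the typical range quoted in the text.

```python
from statistics import mean, stdev

def relative_standard_deviation(intensities):
    """Sample standard deviation divided by the mean, as a fraction."""
    return stdev(intensities) / mean(intensities)

# Hypothetical peak intensities from five repeated injections of one sample.
repeats = [1.00e6, 1.15e6, 0.90e6, 1.05e6, 0.95e6]
rsd = relative_standard_deviation(repeats)
```

For these made-up values the result is roughly 0.095, that is, about 10%.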




3. Experimental Design 

The most straightforward application of LC-MS is measurement of
relative concentration changes in known metabolites: a set of samples is
generated, analyzed by LC-MS, and signals compared for different known
metabolites. LC-MS sensitivity fluctuates even within a day. As such,
biological comparators, not replicates, should be run in direct succession
(e.g., the order A-B-A-B, not A-A-B-B). Such running order also mitigates
errors due to metabolite instability, which can be particularly problematic
for unstable species such as nucleotide triphosphates, CoA compounds,
folates, and NAD(P)H. Even for species that are stable on their own in
solution, reactions can occur with other metabolome components (e.g.,
nucleophiles like amines can react with electrophiles like carbonyls).
To minimize systematic errors due to metabolite instability, it is standard
practice in our lab to analyze samples within 24-36 h of their generation.
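
An interleaved running order of the kind recommended above (A-B-A-B rather than A-A-B-B) can be generated programmatically. This is a small sketch; the function name and sample labels are illustrative.

```python
from itertools import chain, zip_longest

def interleaved_run_order(samples_by_condition):
    """Order samples so that conditions alternate (A1, B1, A2, B2, ...)."""
    rounds = zip_longest(*samples_by_condition.values())
    return [s for s in chain.from_iterable(rounds) if s is not None]

order = interleaved_run_order({
    "A": ["A1", "A2", "A3"],
    "B": ["B1", "B2", "B3"],
})
```

The same function handles more than two conditions or unequal replicate numbers, in which case the conditions with fewer samples simply drop out of the later rounds.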
The most rigorous way of ruling out analytical artifacts is to use isotopic
internal standards. However, pure isotope-labeled standards for most
metabolites are not available. An alternative is to generate an isotope-labeled
standard mixture from cells fed uniformly 13C-labeled glucose (Mashego et al.,
2004). Unfortunately, such cell-derived standards introduce a myriad of
new species into already complex metabolome samples. Accordingly, we 
typically spike our samples with a limited set of pure isotope-labeled internal 
standards, which we use to check analytical performance. Assuming ade- 
quate results for the standards (i.e., no systematic variation across biological 
conditions), we draw biological conclusions based on ratios of signal inten- 
sities across samples without correction for internal standard signals. The 
most biologically significant results are then validated through alternative 
experimental approaches. 

Beyond relative quantitation of known metabolites, LC-MS can enable
discovery of novel metabolites. Measurement of unanticipated metabolites
("untargeted analysis") can be done in parallel with quantitation of known
metabolites ("targeted analysis"), but introduces some additional considerations:

(1) Certain LC-MS techniques, especially those relying on MS/MS to
achieve specificity (e.g., triple quadrupole MS), are better suited to
targeted analysis than metabolite discovery. For untargeted analysis,
instruments with good mass resolution and full scan sensitivity (i.e.,
sensitivity when scanning over a range of molecular weights using MS,
without relying on MS/MS) are preferred (e.g., time-of-flight (TOF)
or Orbitrap). See Section 11 for details.

(2) Untargeted analysis generates a large amount of raw data. Appropriate
computational tools are needed to find compounds of interest.

(3) Each analyte present in the samples may generate multiple adduct ions.
These vary depending on the ionization mode, that is, whether the ion
source is set to generate, and the mass spectrometer set to detect,
positive or negative ions. In positive ion mode LC-MS, quantitation
is typically based on [M + H]+; however, one may also see
[M + Na]+, [M + K]+, etc. In negative mode LC-MS, quantitation is
typically based on [M - H]-; however, one may see [M + Na - 2H]-,
[M + K - 2H]-, etc. For a more comprehensive list of ionization
adducts, see Table 16.2, under "Untargeted Analysis."

(4) Although accurate mass is a powerful tool for identifying molecular
formulas (and often for pulling up candidate metabolite structures
through publicly available compound databases), compound purification
and nuclear magnetic resonance (NMR) analysis are typically
required to assign structures to genuinely novel compounds.

LC-MS can also be used for absolute metabolite quantitation. Absolute
quantitation enables calculation of mass action ratios, and thereby the
thermodynamically favored direction of net metabolite flux (Henry et al., 2007;
Kümmel et al., 2006). Absolute metabolite concentrations also dictate the
extent of saturation of enzyme sites, and thus the sensitivity of reaction rates
to changes in substrate, product, and competitive inhibitor concentrations
(Bennett et al., 2009).

The optimal approach to absolute metabolite quantitation involves 
spiking isotope-labeled internal standard compounds into the yeast extrac- 
tion solvent. The ratio of unlabeled-to-labeled compound signal, multiplied 
by standard concentration, then gives the endogenous cellular compound 
concentration. When isotope-labeled standard is not available, one can 
instead label the cells by feeding isotope-labeled nutrient. More readily 
available unlabeled compounds can then be used as the internal standards 
(Mashego et al., 2004; Wu et al., 2006b). In such cases, it is advisable to
correct for incomplete labeling of cellular metabolites (Bennett et al., 2008).
With either approach, it is important to add the internal standard directly 
into the extraction solution, to correct for the substantial compound losses 
(e.g., to adsorption or degradation) that often occur during extraction. 
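
The ratio calculation described in this paragraph can be sketched as below. The function name, units, and signals are hypothetical; the result is the concentration in the extract, and converting it to an intracellular concentration would additionally require the extracted cell volume, which is not covered here.

```python
def endogenous_concentration(unlabeled_signal, labeled_signal, standard_conc):
    """Unlabeled-to-labeled signal ratio multiplied by the standard concentration."""
    if labeled_signal <= 0:
        raise ValueError("labeled standard signal must be positive")
    return unlabeled_signal / labeled_signal * standard_conc

# Hypothetical signals; the standard is spiked at 2.0 uM in the extraction solvent.
conc_in_extract = endogenous_concentration(3.0e5, 1.0e5, 2.0)
```

Because the standard is present from the moment of extraction, losses that affect labeled and unlabeled forms equally cancel in the ratio.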

An alternative use of isotope labels involves probing metabolic flux
(Moxley et al., 2009; Sauer et al., 2007). While full flux quantitation
involves experimental and data analysis challenges that are beyond the
scope of this chapter, useful information can be achieved through relatively
simple LC-MS experiments. These break down into two main categories:
studies of labeling kinetics and studies of steady-state isotope patterns. 

To investigate labeling kinetics, cells are transferred from normal media
to media containing an isotope-labeled nutrient (e.g., 15N-ammonia or
13C-glucose). Samples are taken at various time points thereafter, and the
rate of labeling of cellular metabolites is quantitated by LC-MS. Fast labeling
generally implies high flux from the labeled nutrient to the measured
metabolite. If the structure of the metabolic network is known, then
quantitative flux determination is often possible (Munger et al., 2008; Yuan et al.,
2008). If not, qualitative interpretation is still useful. For example, one can
categorize potential novel metabolites into high flux species (potentially
central players in metabolism) versus low flux ones (potentially degradation
products of limited biological importance). One can also qualitatively assess
major flux changes between conditions, for example, whether synthesis of a
particular metabolite shuts off during starvation (Brauer et al., 2006).
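
For a qualitative comparison of labeling speed, one simple summary is the time at which half of a metabolite pool has become labeled. The sketch below is an illustrative assumption rather than a method from this chapter: it uses linear interpolation between time points, hypothetical signals, and made-up function names.

```python
def labeled_fraction(unlabeled, labeled):
    """Fraction of the metabolite pool that carries label."""
    return labeled / (unlabeled + labeled)

def half_labeling_time(times, fractions):
    """Linearly interpolate the time at which the labeled fraction reaches 0.5."""
    points = list(zip(times, fractions))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 < 0.5 <= f1:
            return t0 + (0.5 - f0) * (t1 - t0) / (f1 - f0)
    return None  # no upward crossing of 50% within the sampled window

# Hypothetical time course (minutes) with (unlabeled, labeled) signals.
times = [0, 1, 2, 5, 10]
signals = [(100, 0), (75, 25), (40, 60), (15, 85), (5, 95)]
fractions = [labeled_fraction(u, l) for u, l in signals]
t_half = half_labeling_time(times, fractions)
```

A short half-labeling time would mark a candidate high-flux species, while a pool that never reaches 50% labeling in the sampled window returns None and would be a candidate low-flux species.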

Studies of steady-state isotope patterns involve feeding a mixture of labeled 
and unlabeled carbon source, and then assessing the isotopomer patterns of 
different cellular species. This approach is particularly useful for determining 
which of two alternative pathways is the main route to a particular 
metabolic product. For example, in yeast, oxaloacetate can be produced by
either malate dehydrogenase (turning of the TCA cycle) or pyruvate carboxylase
(anaplerosis). Feeding a mixture of unlabeled and labeled glucose results in
different isotopomer patterns for malate versus pyruvate; the extent to which
oxaloacetate, or its transamination product, aspartate, mirrors one versus the
other provides the relative pathway fluxes. Similar logic applies at other points
of metabolic conversion, and can be used to deduce system-wide fluxes
(Antoniewicz et al., 2007; Sauer, 2006; Wiechert and Nöh, 2005).
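
The mixing logic can be illustrated with a least-squares estimate of the fraction of product formed through one of two routes. This sketch assumes that the labeling pattern expected from each route has already been expressed on the same carbon basis as the product (pyruvate carboxylase adds a carbon, so the raw pyruvate pattern cannot be used directly); the patterns below are invented numbers and the function name is hypothetical.

```python
def pathway_fraction(product, route_a, route_b):
    """Least-squares fraction f with product ~ f * route_a + (1 - f) * route_b.

    Each argument is a labeling pattern given as a list of fractional
    abundances of equal length.
    """
    num = sum((p - b) * (a - b) for p, a, b in zip(product, route_a, route_b))
    den = sum((a - b) ** 2 for a, b in zip(route_a, route_b))
    if den == 0:
        raise ValueError("the two routes give identical patterns")
    return min(1.0, max(0.0, num / den))

# Hypothetical M+0..M+3 patterns, not measured data.
malate_like = [0.40, 0.30, 0.20, 0.10]
pyruvate_like = [0.50, 0.00, 0.00, 0.50]
aspartate = [0.47, 0.09, 0.06, 0.38]
f_malate_route = pathway_fraction(aspartate, malate_like, pyruvate_like)
```

With these invented patterns about 30% of the product is attributed to the malate-like route. The estimate is only as good as the separation between the two candidate patterns, which is why a mixture of labeled and unlabeled carbon source is fed.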




4. Strains 

As is typical in cell biology, selection of strain and culture conditions is 
critical. Although most yeast strains are suitable for metabolomic analysis, 
metabolic differences between strains are common and should be considered.
For example, S288c and its derivatives are HAP1 deficient (Gaisne
et al., 1999). Accordingly, although numerous strain collections exist in an
S288c background (Brachmann et al., 1998; Ho et al., 2009; Sopko et al.,
2006), CEN.PK may be preferred for some metabolomic applications (van
Dijken et al., 2000).

Many strain collections contain auxotrophies as selectable markers.
While auxotrophies have only minor impact in many experiments, they
typically have a large impact on the metabolome, changing the concentrations
of compounds both up- and downstream of the block. For example,
knockout of the URA3 gene (a common selectable marker) dramatically
alters the cellular concentrations of pyrimidines and their precursors.
Moreover, it can also impact the concentrations of more distantly related
metabolites like glutamate (Boer et al., 2010). Accordingly, antibiotic
resistance markers are generally preferable to auxotrophic markers.




5. Culture Conditions 

Typical means of culturing yeast for metabolomic studies include batch 
liquid culture, continuous culture in a chemostat, and batch culture on a filter 
support. The major advantages of batch liquid culture are its common use 
and ease; the major advantages of chemostats are control of culture condi- 
tions and reproducibility; and the major advantages of filter culture are facile 
manipulation of the cellular nutrient environment and ease of sampling. 
Irrespective of the culture approach selected, use of a defined medium (such
as yeast nitrogen base) is recommended, as rich medium contains many small
molecule components that can interfere with analysis of cellular metabolites.

5.1. Batch liquid culture and chemostats 

Batch liquid culture involves allowing cells to reproduce in a fixed volume of 
liquid. Due to consumption of nutrients and excretion of waste products, the 
culture conditions change continually as the cells grow. When more consistent 
culture conditions are desired, chemostats provide a valuable tool. Chemostats 
are stirred vessels into which fresh medium (typically having a low
concentration of one essential nutrient) flows continuously. Addition of the
medium provides cells in the vessel with the limiting nutrient, while washing 
out a portion of the cells. In such a culture system, cells will grow until their 
replication rate exactly matches the rate of their washout from the vessel. Thus, 
the experimenter can set the steady-state growth rate of the cells based on the 
rate of medium addition to the chemostat (which, when normalized by the 
chemostat volume, is referred to as the dilution rate). The Dunham lab 
maintains a useful manual detailing chemostat operation (Dunham, 2010). 
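
The relationship between medium flow, culture volume, and steady-state growth rate can be worked through numerically. The flow rate and volume below are hypothetical settings, and the function names are illustrative.

```python
import math

def dilution_rate(flow_ml_per_h, volume_ml):
    """Dilution rate D (per hour): medium flow rate normalized by culture volume."""
    return flow_ml_per_h / volume_ml

def doubling_time_h(growth_rate_per_h):
    """Doubling time for exponential growth at the given specific growth rate."""
    return math.log(2) / growth_rate_per_h

# Hypothetical settings: 60 ml/h into a 300-ml culture.
D = dilution_rate(60.0, 300.0)
t_double = doubling_time_h(D)  # at steady state the growth rate equals D
```

For these settings D is 0.2 per hour, corresponding to a doubling time of about 3.5 h; changing the pump rate is therefore how the experimenter sets the growth rate.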

Beyond generating highly reproducible steady-state cultures, the che- 
mostat approach is valuable for enabling studies of cellular composition as a 
function of growth rate and nature of the limiting nutrient (Brauer et al.,
2008). It allows mutants that differ in growth rate in batch culture to be
compared at a constant growth rate. Furthermore, nutrient-limited cultures
can be pulsed with the limiting species, and dynamic response to relief of
nutrient limitation measured (Wu et al., 2006a,b). In such studies, the spiked
nutrient can be isotope labeled, enabling direct tracing of its assimilation by
MS (Aboka et al., 2009).

Sampling metabolites from either liquid batch or chemostat culture 
presents similar challenges: capturing a discrete volume of culture fluid; 
separating cells from the surrounding medium without altering their meta- 
bolome; and effectively extracting metabolites from the isolated cells. The 
standard literature approach involves mixing culture media directly with
cold (< -40 °C) methanol to quench metabolism, centrifuging in a prechilled
rotor (< -20 °C) to isolate the cells, and subsequently extracting
the cell pellet (Canelas et al., 2008). In this method, the initial mixing of
cells with cold methanol aims to quench metabolism, by cold-induced 
slowing of reaction rates and/or organic-induced enzyme denaturation. 
It aims to avoid, however, leakage of the cells due to membrane disruption 
(e.g., due to cold-induced formation of ice crystals that puncture mem- 
branes, or organic-induced membrane dissolution). After the quenched 
cells are isolated by centrifugation, further exposure to organic solvent 
results in membrane disruption and extraction of metabolites from the 
cells. The main risk in this method is metabolite losses (e.g., due to cell 
leakage) during the centrifugation step. Another downside is the somewhat 
laborious nature of the process. An alternative approach involves separation 
of the cells from medium prior to quenching metabolism, using fast filtra- 
tion. The cell-loaded filter is then simultaneously quenched and extracted 
by placing it into cold extraction solvent. The main deficiency of this 
approach is the potential for metabolome changes during filtration. 

As the deficiencies of the two approaches do not overlap, if similar results are 
obtained using both approaches, one can have high confidence in their reliabil- 
ity. We have recently found that both approaches lead to identical biological 
conclusions for steady-state chemostat cultures limited for a diversity of nutrients 
(Boer et al., 2010), suggesting that either approach is acceptable for steady-state
yeast cultures. The major difference was higher absolute signal for many com- 
pounds, especially nucleotide triphosphates, using the filtration-based approach 
(see Fig. 16.2). Thus, for steady-state or steadily growing yeast cultures, 
filtration provides a convenient option. (Note: For Escherichia coli, in contrast, 
we have observed metabolome changes during filtration and do not recommend 
its use without further validation; in addition, E. coli leak quickly upon exposure 
to organic, so the approach of quenching and then centrifuging is also problem- 
atic for them.) Even for steady-state yeast, key biological conclusions should still 
be confirmed using an approach involving faster quenching. For cultures under- 
going fast metabolome changes (e.g., due to an acute nutrient perturbation), 
filtration is unacceptably slow and a faster quenching approach is needed. 
Relevant harvesting protocols are provided below. 

5.2. Protocol for harvesting yeast by vacuum filtration 

1. Construct a filtration apparatus as shown in Fig. 16.1. 

2. Check for leaks. 

3. Thoroughly rinse the apparatus with purified water. 

4. Place a 25-mm, 0.45-μm pore size nylon filter on the filter base and
pre-wet with purified water.

5. Connect a 15-ml centrifuge tube to the rubber stopper at the bottom of the
filtration apparatus, forming a tight seal (see Fig. 16.1). This tube will be used
to collect the extracellular media.

6. Measure 10 ml of culture, using either a 15-ml centrifuge tube or a
volumetric pipette. Recommended culture density at time of extraction is
~4 × 10^7 cells/ml (Klett ~130 or OD650 ~0.5).

7. Pour culture immediately into the glass cylinder at the top of the filtration
apparatus. Filtration should occur rapidly.

8. When filtration appears complete, wait ~1 s and then remove the clamp
to free the filter.

9. Immediately place the filter, cells side down, into a 35-mm Petri dish
containing 700 μl prechilled (-20 °C) extraction solvent (for details, see
Section 6.1). Time from initiation of sampling to quenching should not
exceed 30 s. For subsequent steps, see Section 6.1.



5.3. Protocol for harvesting yeast by centrifugation after 
methanol quenching 

1. Prechill a centrifuge rotor capable of handling 50-ml centrifuge tubes to
-80 °C.

2. Directly quench 10 ml culture broth into 20 ml of -80 °C methanol in a
50-ml centrifuge tube.

3. Spin down at ~2000 × g in a centrifuge cooled to -10 °C for 5 min.

4. Discard supernatant.

5. For subsequent steps, see Section 6.2.

Figure 16.1 Example of a filtration apparatus. The two-hole rubber stopper seals the
top of a 15-ml centrifuge tube that will be used to collect the culture medium. One hole
connects to vacuum, the other to the filter support. The filter sits on top of the glass (or
metal) frit, with the open-bottom graduated cylinder attached by a clamp. The filter
must cover the entire frit; otherwise cells will be lost during filtration and quantitation
will be unreliable. To initiate filtration, cells are poured into the graduated cylinder
at the top of the apparatus. Once filtration is complete, the clamp is removed and the
filter quickly transferred to the quenching solvent.



5.4. Filter culture 

Yeast readily form colonies on agar plates. Such colonies involve a hetero- 
geneous mixture of cells experiencing different nutrient environments. 
Moreover, quantitative sampling of them is difficult. Accordingly, this 
culture approach is not well suited to metabolomic studies. A variant, 
however, can be useful: growing yeast on the surface of a filter atop an 
agarose-medium support. In the filter culture technique, a modest number 
of cells is spread evenly across the filter surface, resulting in almost all cells 
having direct contact with both the filter (and thus the underlying nutrients) 
and the atmosphere. The number of cells is selected to cover ~10-20% of
the filter surface. This avoids inhomogeneities in nutrient access, and
enables cells to replicate on the filter for several doublings at a rate similar to
their growth in comparable liquid medium.

The principal virtue of this technique, compared to liquid batch culture, is 
the ability to quickly manipulate the culture's environment. For example, 
filter culture enables rapid induction of nutrient starvation (by switching to a 
plate lacking one medium component). It also allows rapid replacement of 
unlabeled with labeled nutrient, as required for kinetic flux profiling (Yuan 
et al., 2008). (Note: The filter culture approach also works well for bacteria 
including E. coli.) 

5.5. Protocol for growth of yeast on filters atop agarose 
support 

1. Prepare a 2× concentrate of the minimal medium of interest.

2. Autoclave 30 g/l of triply washed ultrapure agarose in purified water.

3. Mix in 1:1 ratio to yield the desired minimal medium containing 15 g/l of
agarose. Equilibrate temperature to ~55 °C in a water bath.

4. Pour 10 ml of mixture per 60 mm Petri dish. Best results are obtained if 
plates are used within 48 h after pouring. 

5. Grow yeast to ~1 × 10^7 cells/ml in liquid batch culture.

6. Place a nylon filter (47 mm diameter, 0.45 µm pore size) on top of a
filtration apparatus, under weak vacuum.

7. Using a 2-ml pipette, spread 1.6 ml of cell culture evenly across the filter
surface. Steadying the pipette with a second hand makes it easier to distribute
the cells evenly. The vacuum is at the correct intensity if the
culture pools very briefly before the medium is pulled through the filter.

8. Place filters on individual agarose plates with cells facing away from the 
agarose surface. Use of broad head forceps (Millipore/#XX6200006) 
will reduce risk of puncturing the filter. 

9. Allow cells to grow on the filter surface for ~2.5 h. After this time the 
cultures are generally ready for perturbation or extraction. 

For further details on filter culture, see Bennett et al. (2008). 
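
The 1:1 mixing arithmetic in steps 1-4 can be sketched as follows. This is only an illustrative calculation: the function name, the 10% excess allowance, and the example plate count are assumptions, not part of the protocol.

```python
def mix_one_to_one(plates, ml_per_plate=10.0, agarose_stock_g_per_l=30.0, excess=0.1):
    """Volumes for mixing 2x medium with agarose stock in a 1:1 ratio.

    `excess` is a hypothetical allowance for pouring losses.
    """
    total_ml = plates * ml_per_plate * (1 + excess)
    return {
        "media_2x_ml": total_ml / 2,           # half the volume is 2x medium
        "agarose_stock_ml": total_ml / 2,      # half is 30 g/l agarose
        "final_agarose_g_per_l": agarose_stock_g_per_l / 2,  # 1:1 mix halves it
        "final_media_strength_x": 1.0,         # 2x medium diluted 1:1
    }

# Hypothetical batch of 20 plates at 10 ml per 60-mm dish
recipe = mix_one_to_one(20)
```

For the hypothetical 20-plate batch, the sketch returns equal volumes of the two stocks and confirms the 15 g/l final agarose concentration stated in step 3.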

6. Metabolite Extraction 

Once metabolism is quenched, the next step is harvesting metabolites 
from the yeast. The goal is to obtain the most complete extraction possible, 
while avoiding degradation of metabolites, including conversion of one 
metabolite into another. An early approach involved freeze-drying the 
quenched cells, grinding the frozen cell pellet with a mortar and pestle, and 
extracting using strong acid or base in water (Saez and Lagunas, 1976). Later, 
the grinding steps were eliminated and replaced with a single-step extraction 
into boiling ethanol (Castrillo et al., 2003; Gonzalez et al., 1997). While this
method performs well for most metabolites, some carboxylic acids such as
pyruvate and fumarate are lost (Loret et al., 2007). Based on data from E. coli
showing the superiority of cold methanol to hot ethanol (Prasad Maharjan and
Ferenci, 2003), cold methanol was tested in yeast, with generally favorable
results (Villas-Boas et al., 2005). Based on our finding that
acetonitrile:methanol:water mixtures extract nucleotide triphosphates from E. coli
much better than methanol:water alone (Rabinowitz and Kimball, 2007), we have
tested acetonitrile:methanol:water (40:40:20, v/v/v) also for yeast (Fig. 16.2). For
most compounds, results are similar for both mixtures. For nucleotide
triphosphates, however, we find a large improvement with acetonitrile, but only for
cells harvested by filtration, not methanol quenching. Our interpretation is that
initial exposure of the cells to methanol (in the absence of acetonitrile) results in
irreversible triphosphate losses. Note that acetonitrile can degrade
nitrocellulose; accordingly, if extracting using acetonitrile, use nylon filters.

To keep quenching solvent and extraction mixtures cold, we have found 
it convenient to use a cold metal surface. An easy way to maintain such a 
surface is to fill a 13 in. × 9 in. × 2 in. baking tray with gel-packs. With the
gel-pack sitting in the bottom of the tray, overflow the remainder with 
paper towels. Using packing tape, tightly compress paper towels and ice 
packs into the baking tray. The tray can then be frozen. After freezing, 
invert the tray. The tray bottom (now facing upward) provides a useful cold 
working surface. We find it helpful to build a border (e.g., using paper 
towels and tape) around the bottom of the tray (the cold working surface) to 
prevent Petri dishes from slipping off.

Figure 16.2 Metabolite yields for two cell harvesting methods extracted with two
different solvent mixtures. Arginine yield is insensitive to the harvesting and extraction
method, whereas UTP yield is increased when cells are harvested by vacuum filtration
and extracted with acetonitrile:methanol:water mixture. Error bars reflect standard
error of N = 3 experimental replicates.

6.1. Extraction protocol for cells on filters (from vacuum
filtration of liquid culture, or from filter cultures)

1. Prepare extraction solution: by volume, 40% acetonitrile, 40% metha- 
nol, and 20% water. All solvents should be highest purity available 
(minimum HPLC grade). 

2. Cool extraction solution to -20 °C. Shake to mix prior to use.

3. When ready to extract, pipette cold extraction solvent (-20 °C) into a
60-mm Petri dish (1 ml extraction solvent/dish). Store at -20 °C until
use, avoiding prolonged storage (e.g., >1 h) due to risk of water
condensation in the solvent.

4. Place filter, cell-side down, into the cold extraction solvent.

5. Allow extraction to proceed (stirring not required) at -20 °C for
15 min.

6. Collect as much solvent, cells, and debris as possible into a 1.5-ml 
Eppendorf tube. To remove cells and debris from the filter, flip it cell- 
side up and wash 10 times with the pooled solvent at the bottom of the 
Petri dish. To release adherent solvent from the filter, dab it on a dry 
part of the Petri dish approximately five times. From this point forward 
the sample can be kept on ice. 

7. Spin down the mixture in a microcentrifuge (highest speed,
~16,000 × g) for 5 min to pellet cell debris.

8. Transfer the supernatant to a separate 1.5-ml Eppendorf tube and
resuspend the remaining pellet in 100 µl fresh extraction solvent. The pellet is
typically difficult to resuspend. Prefilling the pipette tip with the 100 µl
of extraction solvent and gently perturbing the pellet with the pipette tip
before depressing to release the extraction solvent can aid in resuspension.
Take care not to clog the pipette tip with sticky cell debris.

9. Keep the resuspended mixture on ice for an additional 15 min.

10. Spin down sample and pool the supernatant with the prior fraction. 

11. Vortex pooled mixture and analyze. 
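
When preparing the 40:40:20 extraction solution of step 1 for a batch of samples, the component volumes can be worked out as in the sketch below. The function name, the sample count, and the 20% overage are assumptions made for illustration; the per-sample volumes (1 ml per dish plus 100 µl for the pellet re-extraction) come from the protocol above.

```python
def extraction_solvent_volumes(total_ml, ratios=(40, 40, 20)):
    """Split a total volume into acetonitrile, methanol, and water by volume ratio."""
    names = ("acetonitrile", "methanol", "water")
    total_parts = sum(ratios)
    return {name: total_ml * part / total_parts for name, part in zip(names, ratios)}

# Hypothetical batch: 12 filters, each needing 1 ml in the dish plus 0.1 ml
# for re-extracting the pellet, with an assumed 20% overage for pipetting losses.
n_samples = 12
needed_ml = n_samples * (1.0 + 0.1) * 1.2
volumes = extraction_solvent_volumes(needed_ml)
```

The sketch treats the recipe as measured component volumes; it does not correct for the slight volume contraction that occurs when organic solvents are mixed with water.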



6.2. Extraction protocol for cells after methanol quenching 

1. Prepare extraction solution: by volume, 40% acetonitrile, 40% metha- 
nol, and 20% water, or alternatively 80% methanol and 20% water 
(both give roughly equivalent results for methanol-quenched cells). All 
solvents should be highest purity available (minimum HPLC grade).

2. Cool extraction solution to -20 °C. Shake to mix prior to use.

3. When ready to extract (i.e., after pouring off supernatant following the
initial quenching step), pipette 400 µl cold extraction solvent (-20 °C)
directly onto pellet.

4. After pipetting up and down (do not vortex), transfer mixture to a
1.5-ml Eppendorf tube and let sit on ice for 15 min.

5. Spin down the mixture in a microcentrifuge (highest speed,
~16,000 × g) for 5 min to pellet cell debris.

6. Transfer the supernatant to a separate 1.5-ml Eppendorf tube, recording
the volume recovered.

7. Resuspend the pellet in a volume of extraction solvent equal to the
difference between 800 µl and the volume of supernatant recovered. The pellet is
typically difficult to resuspend. Prefilling the pipette tip with the
extraction solvent and gently perturbing the pellet with the
pipette tip before depressing to release the extraction solvent can aid in
resuspension. Take care not to clog the pipette tip with sticky cell
debris.

8. Keep the resuspended mixture on ice for an additional 15 min.

9. Spin down sample and pool the supernatant with the prior fraction.

10. Vortex pooled mixture and analyze.
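
The volume bookkeeping in steps 6 and 7 (bringing the pooled extract to a common 800 µl) can be sketched as below. The function name and the example recovered volume are hypothetical; only the 800 µl target comes from the protocol.

```python
def resuspension_volume_ul(recovered_ul, target_total_ul=800.0):
    """Solvent volume for the second extraction so that pooled fractions total 800 µl."""
    if recovered_ul < 0:
        raise ValueError("recovered volume cannot be negative")
    if recovered_ul >= target_total_ul:
        raise ValueError("recovered volume already meets or exceeds the target")
    return target_total_ul - recovered_ul

# Hypothetical example: 430 µl of supernatant recovered in step 6
second_extraction_ul = resuspension_volume_ul(430.0)
```

Fixing the pooled volume in this way keeps the dilution factor the same across samples, even when the amount of liquid carried over with each pellet differs.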




7. Chemical Derivatization of Metabolites 

While LC-ESI-MS enables measurement of most metabolites directly,
separation and/or ionization of some compounds can be enhanced by chemical
derivatization. Many useful derivatization procedures are available in the
literature (Carlson and Cravatt, 2007a,b; Dettmer et al., 2007; Lamos et al.,
2007; Shortreed et al., 2006; van der Werf et al., 2007). Here we highlight two
examples: amino acids and thiol-containing compounds. For amino acids, the
goal of derivatization is to enhance retention on reversed-phase
chromatography. We do this by reaction of the amine to generate a carboxybenzyl,
or Cbz, derivative (Scheme 16.1). Such derivatized amino acids ionize preferentially
in negative mode due to their free carboxylic acid moieties. For thiol-containing
compounds, the goal of derivatization is to convert both free thiols and
disulfides to a common form that is readily measured by LC-ESI-MS. To this
end, we use a reagent that reacts with both free thiols and disulfides, methyl
methanethiosulfonate (Scheme 16.2). This reaction sacrifices information
regarding cellular thiol oxidation state, with the benefit of avoiding analytical
complexities due to disulfide formation during extraction or sample storage.
Relevant protocols are provided in the following sections.




Reaction Scheme 16.1 Generation of carboxybenzyl (Cbz)-derivatized amino acids.

Reaction Scheme 16.2 Generation of methyl disulfide-derivatized thiols.



7.1. Amino acid derivatization 

1. Add 4 µl triethylamine to 95 µl of metabolome extract.

2. Vortex to mix.

3. Add 1 µl benzyl chloroformate.

4. Vortex to mix. 

5. Analyze. 



7.2. Thiol and disulfide derivatization 

Derivatization stock preparation: 

1. Add 795 µl of ultrapure water to a 1.5-ml Eppendorf tube.

2. Add 200 µl of 1 M ammonium acetate.

3. Vortex to mix.

4. Add 4.7 µl methyl methanethiosulfonate.

Derivatization:

5. Add 10 µl of derivatization stock to 90 µl of metabolome extract.

6. Vortex and let sit for at least an hour at 4 °C in the dark. 

7. Analyze. 
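
For readers who want to know the reagent concentrations these volumes imply, a rough calculation is sketched below. The density and molecular weight of methyl methanethiosulfonate used here are assumed approximate values that should be checked against the supplier's data sheet, and the sketch assumes volumes are additive; none of these numbers are stated in the protocol itself.

```python
# Assumed approximate properties of methyl methanethiosulfonate (MMTS);
# verify against the supplier's data before relying on them.
MMTS_DENSITY_G_PER_ML = 1.337
MMTS_MW_G_PER_MOL = 126.2

def stock_concentrations_mM(water_ul=795.0, acetate_ul=200.0,
                            acetate_stock_M=1.0, mmts_ul=4.7):
    """Approximate concentrations (mM) in the derivatization stock."""
    total_ml = (water_ul + acetate_ul + mmts_ul) / 1000.0  # assumes additive volumes
    acetate_umol = acetate_ul * acetate_stock_M             # µl × mol/l = µmol
    mmts_mg = mmts_ul * MMTS_DENSITY_G_PER_ML               # µl × g/ml = mg
    mmts_umol = mmts_mg / MMTS_MW_G_PER_MOL * 1000.0        # mg / (g/mol) = mmol
    return {
        "ammonium_acetate": acetate_umol / total_ml,        # µmol/ml = mM
        "mmts": mmts_umol / total_ml,
    }

def dilute(conc_mM, reagent_ul=10.0, extract_ul=90.0):
    """Concentration after adding the stock to the metabolome extract (step 5)."""
    return conc_mM * reagent_ul / (reagent_ul + extract_ul)

stock = stock_concentrations_mM()
final = {name: dilute(conc) for name, conc in stock.items()}
```

Under these assumptions the stock works out to roughly 200 mM ammonium acetate and about 50 mM reagent, each diluted 10-fold in the derivatization reaction.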




8. LC-MS for Mixture Analysis 

Metabolite extracts are chemically complex mixtures. The leading
techniques for complex mixture analysis are gas chromatography-MS
(GC-MS) (Fiehn, 2008), LC-MS (Dunn, 2008), and NMR (Lenz
et al., 2005; Lewis et al., 2007; Mashego et al., 2007). One key advantage
of the MS-based techniques is more sensitive detection of low abundance
species. While GC-MS provides unsurpassed measurement of volatile
analytes, it cannot readily detect unstable metabolites like ATP and NADH, as
they decompose, rather than vaporize, upon heating (Buscher et al., 2009).
These chemical species sit at the center of yeast metabolism. As such,
LC-MS has become a first-line approach for yeast metabolomics.

One of the key issues in LC-MS analysis is the interface between LC
(where the analytes are in solution) and MS (where the analytes must be in
the gas phase). Analytes need to be ionized in order to be detected in the
mass spectrometer. The two leading methods of converting LC output into
gas-phase ions are atmospheric pressure chemical ionization (APCI) and
electrospray ionization (ESI). For small molecule analytes containing
functional groups that are charged in solution (i.e., most metabolites), ESI
generally provides the more efficient ionization (Kostiainen and Kauppila,
2009). ESI is, however, a competitive process, wherein an abundant,
strongly ionizing species in solution can suppress ionization of a less
abundant one, a problem known as "ion suppression." Minimizing ion
suppression is critical for effective detection of low abundance analytes and for
avoiding quantitative artifacts, where the signal for a species rises or falls, due
not to a change in its own concentration, but to a change in the
concentration of a coeluting, ion-suppressing species. Such quantitative artifacts
due to ion suppression are among the most likely causes of major errors in
LC-ESI-MS analysis. To minimize ion suppression, high-quality LC
separation is critical. Performance of an LC-ESI-MS method therefore depends
strongly on the integrated functioning of the three steps: chromatographic
separation, ESI, and mass spectrometry detection (Fig. 16.3). The following
sections will examine each step in turn.



9. Liquid Chromatography 



Metabolite extracts contain a diversity of small molecules that differ in
their physicochemical properties of size, polarity/hydrophobicity, and
charge. As such, no LC method is ideal for all classes of metabolites.
While various LC methods exist in the literature, most of them deal with
only a limited set of analytes. Nevertheless, progress has been made in recent
years towards comprehensive metabolite separation (van der Werf et al.,
2007). Use of two or more complementary LC methods increases the
fraction of the metabolome that can be reliably detected and quantified
(Sabatine et al., 2005).

Figure 16.3 Core components of LC-MS. Mass spectrometers schematically depicted
in (C) are (a) triple quadrupole, (b) time-of-flight, and (c) Orbitrap.

In general, the most robust LC approach to small molecule separation
involves reversed-phase chromatography using a nonpolar stationary phase,
for example, C18 RP-HPLC. Gradients begin with high water content and
gradually add methanol or acetonitrile to elute hydrophobic compounds.
Polar molecules elute earlier and nonpolar molecules later. Although C18
RP-HPLC is a good starting point for metabolome analysis (Trauger et al.,
2008), many polar metabolites do not retain adequately, eluting near the
void volume during the beginning of the chromatographic run. In addition,
nucleotide triphosphate compounds like ATP often do not elute as
well-defined peaks. These problems tend to be especially acute if the
metabolome extract contains organic solvent.

To mitigate this problem, one can strip all organic solvent from the 
metabolome extract by evaporation, for example, drying under nitrogen 
flow, vacuum centrifugation, or freeze drying. The dried sample is then 
redissolved in pure water, resulting in improved chromatographic perfor- 
mance. The downsides are risks of sample alteration: labile metabolites may 
degrade during sample concentration (a particular concern as reactions 
between metabolites accelerate as sample concentration increases); more 
hydrophobic metabolites may fail to redissolve in pure water; and refolding 
and thereby reactivation of some enzymes may occur when the sample is 
redissolved in water (these enzymes may then trigger metabolic reactions). 
A recent exploration of yeast extraction methods included examination of 
metabolite loss due to lyophilization (Villas-Boas et al., 2005). Loss of 
certain metabolite classes, including lipids, nucleotides, and basic amino 
acids, was observed. 

Another approach is to enhance retention of polar analytes using an
ion-pairing agent: a volatile charged compound that pairs with oppositely
charged analytes in solution, resulting in an ion-ion complex. The
ion-pairing agent contains hydrophobic moieties that enhance binding of the
ion-ion complex to the C18 column. The ion-pairing agent is typically
added only to the aqueous mobile phase. As the organic fraction increases
during the LC gradient, the concentration of ion-pairing agent drops,
favoring effective column elution. Due to its volatility, the pairing agent
dissociates into the gas phase during ESI.

To date, metabolomic studies have used amine-containing ion-pairing
agents coupled with negative mode ESI (Coulier et al., 2006; Lu et al., 2008;
Luo et al., 2007). In our lab, we use reversed-phase ion-pairing
chromatography with tributylamine as the pairing agent as our first-line metabolome
analysis approach. This LC approach provides reliable separation of a broad
range of negatively charged metabolites, including constituents of
glycolysis, the pentose phosphate pathway, and the tricarboxylic acid cycle, as
well as nucleotides and Cbz-derivatized amino acids.

Due to their cationic nature, amine-based ion-pairing agents will cause 
ion suppression in positive ion mode. Therefore, they are not suitable for 
the analysis of positively charged metabolites such as underivatized amino 
acids. Direct analysis of amino acids and other highly polar metabolites can 
be achieved using an alternative chromatography approach: hydrophilic 
interaction chromatography (HILIC), which involves a polar stationary 
phase (Tolstikov and Fiehn, 2002). Gradients begin with high acetonitrile 
content and gradually add water to elute polar compounds. Nonpolar 
molecules elute earlier and polar molecules later. A general challenge with 
HILIC is retention time variability, as compound retention is often 
impaired by salts or other components of the biological matrix. Accord- 
ingly, for HILIC methods, it is especially important to judge results based on 
repeated running of actual samples. 

The HILIC method that we commonly use involves an aminopropyl 
stationary phase, which retains metabolites through both hydrogen bonding 
and ionic interactions (Bajad et al., 2006). At the running pH of 9.45,
approximately half of the amino groups on the column are protonated. 
The method separates a broad range of metabolites including amino acids, 
nucleosides, nucleotides, coenzyme A derivatives, carboxylic acids, and 
sugar phosphates. Among HILIC methods, it is relatively robust to the 
sample matrix. An advantage of this approach is its compatibility with 
both positive and negative ESI. Accordingly, it is an attractive analytical 
tool when only a single LC-MS system is available.

In our labs, we routinely apply both HILIC and C18 RPIP-HPLC
methods for yeast metabolome analysis. Cellular extracts are aliquoted
into two different HPLC autosampler vials and run separately on two
LC-MS systems. The first uses HILIC chromatography with positive
mode ionization; the second involves reversed-phase chromatography
with tributylamine as an ion-pairing agent with negative mode ionization.
As ion-pairing agents can be difficult to wash out of LC-MS systems
completely, it is not convenient to run both of these methods in alternating
fashion on a single LC-MS system; a dedicated system for the ion-pairing
chromatography is preferred. For chromatographic details, see Table 16.1.

Looking forward, we anticipate that standard HPLC methods, such as
those described above, will be replaced by related methods involving
smaller particle-size columns (e.g., <2 µm instead of ~5 µm) and higher
pressures (e.g., >400 bar, instead of ~200 bar). In such ultra-performance
liquid chromatography (UPLC), the smaller particle size results in greater
particle surface area, and thereby faster equilibration of analytes with the
column. This in turn enables faster column elution without peak tailing.
The result is increased chromatographic resolution: chromatographic peaks
are narrower with higher signal-to-noise (S/N) ratios (Dunn et al., 2008;
Nguyen et al., 2006; Wilson et al., 2005). Sample running time is generally
shorter, increasing throughput. The fast elution of analytes requires faster
MS scanning, an area in which MS systems continue to improve.

Table 16.1 Two complementary LC methods

Approach        HILIC                            C18 RPIP-HPLC

ESI             Positive                         Negative

Column          Luna NH2 column                  Synergi Hydro column
                (5 µm particle size,             (4 µm particle size,
                250 mm × 2 mm, from              150 mm × 2 mm, from
                Phenomenex, Torrance, CA)        Phenomenex, Torrance, CA)

Solvent A       20 mM ammonium acetate +         10 mM tributylamine +
                20 mM ammonium hydroxide         15 mM acetic acid in
                in 95:5 water:acetonitrile,      97:3 water:methanol
                pH 9.45

Solvent B       Acetonitrile                     Methanol

Flow rate       150 µl/min                       200 µl/min

Running time    40 min                           50 min

Gradient time   0, 15, 28, 30, 40                0, 5, 10, 20, 35, 38, 42, 43, 50

%B              85, 0, 0, 85, 85                 0, 0, 20, 20, 65, 95, 95, 0,
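
The gradient rows of Table 16.1 can be read as breakpoints of a solvent program. The sketch below encodes them and interpolates %B at an arbitrary time. Two points are assumptions rather than statements from the table: that the pump ramps linearly between breakpoints, and that the final %B value of the RPIP gradient (missing from the table as reproduced here) is 0, matching the re-equilibration step before it.

```python
from bisect import bisect_right

# (time in min, %B) breakpoints from Table 16.1
HILIC = [(0, 85), (15, 0), (28, 0), (30, 85), (40, 85)]
# The last RPIP value (at 50 min) is assumed to be 0; it is not in the table.
RPIP = [(0, 0), (5, 0), (10, 20), (20, 20), (35, 65),
        (38, 95), (42, 95), (43, 0), (50, 0)]

def percent_b(gradient, t_min):
    """%B at time t_min, assuming linear ramps between breakpoints."""
    times = [t for t, _ in gradient]
    if not times[0] <= t_min <= times[-1]:
        raise ValueError("time outside the gradient program")
    i = bisect_right(times, t_min)
    if i == len(times):                 # exactly at the final breakpoint
        return float(gradient[-1][1])
    (t0, b0), (t1, b1) = gradient[i - 1], gradient[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)

hilic_mid_ramp = percent_b(HILIC, 7.5)   # halfway down the 85 -> 0 ramp
rpip_mid_ramp = percent_b(RPIP, 27.5)    # halfway up the 20 -> 65 ramp
```

Note the opposite directions of the two programs: the HILIC run starts at high acetonitrile and adds water, whereas the reversed-phase run starts aqueous and adds methanol.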




10. Electrospray Ionization 



The past 20 years have seen a dramatic rise in use of LC-MS as an
analytical technique due in large part to the advent of ESI: applying a strong
voltage to the liquid stream exiting the tip of a needle. This seemingly
simple trick enables efficient conversion of charged molecules from the
liquid phase into gas-phase ions. These ions can subsequently be analyzed by
MS. The physical mechanisms of ESI remain only partially understood.
Charged droplets are initially produced by electrostatic dispersion when
liquid emerges from the tip of the metal needle (Nguyen and Fenn, 2007).
Solvent then evaporates from the charged droplets. As the droplets become
smaller and smaller, the like-charged ions within them repel due to
coulombic forces, eventually resulting in release of gas-phase ions.

A major pitfall of ESI is the competitive nature of ion formation. If too
many ions are present during their expulsion from the charged capillary, ion
production will not increase linearly with concentration. This results in
concentration underestimation. No undisputed method for eliminating "ion
suppression" exists. Instead, one needs to determine the extent of ion
suppression and correct for it. This is best done by using an isotopic internal
standard, which will experience identical ion suppression to the analyte of
interest. When isotopic standards are not available (as is typical in
metabolome analysis), a simple alternative involves serial dilution of the sample.
A linear response suggests the absence of ion suppression, while a strongly
nonlinear one points to a problem.
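
One way to make the serial dilution check concrete is to fit the slope of log(signal) against log(dilution fraction): a slope near 1 indicates that signal scales in proportion to the amount injected. The sketch below is an illustration only. The function names, the 0.1 tolerance, and the two data series are hypothetical and would need to be tuned to the instrument and analyte.

```python
import math

def loglog_slope(dilution_fractions, signals):
    """Least-squares slope of log10(signal) versus log10(dilution fraction)."""
    xs = [math.log10(d) for d in dilution_fractions]
    ys = [math.log10(s) for s in signals]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def looks_linear(dilution_fractions, signals, tolerance=0.1):
    """True if the response is roughly proportional (slope within tolerance of 1)."""
    return abs(loglog_slope(dilution_fractions, signals) - 1.0) <= tolerance

# Hypothetical peak areas for a twofold dilution series
fractions = [1.0, 0.5, 0.25, 0.125]
proportional = [8000.0, 4000.0, 2000.0, 1000.0]   # signal halves with each dilution
suppressed = [5000.0, 3600.0, 2000.0, 1000.0]     # undiluted sample reads low
```

In the second hypothetical series the most concentrated samples give less signal than proportionality would predict, which is the signature of ion suppression described above; the fitted slope falls well below 1.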

Frequently in metabolome analysis, ion suppression is a major problem, 
but only during a particular LC retention time window when many species 
coelute, or when salt from the sample comes off the column. In such cases, it 
is especially useful to include isotopic standards for the analytes eluting 
during the problematic chromatographic window. If such standards are 
not available, it may be necessary to analyze a diluted sample, to enable 
rough quantitation of compounds during the problematic LC interval. 




11. Mass Spectrometry 

The ability to resolve and quantitate ions by MS provides a powerful 
tool for investigating yeast metabolism. Particularly useful MS approaches
for metabolite analysis are triple quadrupole MS in multiple reaction moni- 
toring (MRM) mode, high resolution MS in full scan mode, and hybrid MS 
in data-dependent MS/MS mode. The essential features of these approaches 
are described in the following sections. The best approach depends on the 
particular experimental objectives. 



11.1. Triple quadrupole mass spectrometers 

Quadrupoles function as mass filters. At any instant, they allow a particular
m/z to pass through, sending all other ions to waste. Mass resolution is
"low," meaning that, although ions differing by 1 amu are essentially
completely separated, those differing by <0.1 amu are not. Due to this
low mass resolution, for complex mixtures, a single mass filtration step
(single quadrupole MS) is often insufficient to isolate the analyte of interest.
Triple quadrupole MS increases specificity by using two mass filtration steps
in series, separated by a collision cell. The first quadrupole selects for the
parent m/z. The second quadrupole serves as the collision cell, where ions
selected in the first quadrupole are collided with a noble gas (e.g., argon) to
produce fragment ions. The third quadrupole selects for the fragment m/z.
These steps remove most environmental interferences. With careful
selection of fragment ions, they are often sufficient to distinguish closely related
metabolites.

Effective utilization of a triple quadrupole mass spectrometer requires 
optimization of two parameters for each targeted metabolite: the fragment 
ion to select in the third quadrupole, and the collision energy to produce 
that fragment in the collision cell. These parameters are best determined 
using compound standards. Once the parameters are determined, each 
compound is measured via a "selected reaction monitoring" (SRM) scan 
event, where the "selected reaction" refers to the parent ion forming a 
specific fragment ion at a defined collision energy in the MS instrument's 
collision cell. In metabolomics, SRM scan events for different compounds 
are performed in series, an approach called MRM. MRM can also refer to 
monitoring production of multiple different product ions for a single parent 
ion. The main advantages of MRM are sensitivity and linear dynamic range 
(which result from the efficiency of quadrupoles and ion detectors) and 
specificity (due to the two MS steps). There are two major disadvantages: 
data is limited to targeted analytes and sensitivity and quantitative precision 
fall when analyzing large numbers of compounds, as the scan time becomes 
divided over many SRM events. To mitigate the latter problem, MRM 
scans for a given metabolite are typically performed only during the reten- 
tion time interval in which the compound elutes. 
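
The benefit of restricting each SRM to its retention time window can be illustrated with a small calculation of how the scan cycle is shared among the SRMs active at a given moment. Everything in the sketch is an assumption made for illustration: the 1-s cycle time, the retention windows, and the two placeholder transitions; only the 259 to 199 transition for glucose-6-phosphate appears in this chapter. Instrument overhead between scan events is ignored.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Srm:
    name: str
    parent_mz: float
    product_mz: float
    rt_start_min: float   # start of the retention time window
    rt_end_min: float     # end of the retention time window

def active_srms(srms, t_min):
    """SRM events whose retention window contains time t_min."""
    return [s for s in srms if s.rt_start_min <= t_min <= s.rt_end_min]

def dwell_time_ms(srms, t_min, cycle_time_ms=1000.0):
    """Scan time available per SRM at t_min, or None if nothing is scheduled."""
    active = active_srms(srms, t_min)
    if not active:
        return None
    return cycle_time_ms / len(active)

# Hypothetical method: windows and the two placeholder compounds are invented.
method = [
    Srm("glucose-6-phosphate", 259.0, 199.0, 10.0, 14.0),
    Srm("placeholder A", 300.0, 150.0, 12.0, 16.0),
    Srm("placeholder B", 400.0, 200.0, 20.0, 24.0),
]
```

With overlapping windows, two SRMs share the cycle at 13 min, while a lone SRM at 22 min receives the full cycle; without scheduling, all three would split every cycle.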

Performance of an MRM-based method depends strongly on selecting
the optimal product ion. This requires striking a balance between signal
intensity (which is optimized by choosing the most abundant product ion)
and specificity (which involves picking as unique a product ion as possible).
Generally, it is desirable to avoid product ions that are formed by loss of
H2O or NH3, as such losses occur for a wide variety of parent ion structures.
MS spectra of glucose-1-phosphate and glucose-6-phosphate can be seen
in Fig. 16.4. For glucose-6-phosphate, a fragment ion of 199 m/z is specific,
so one may use the SRM 259 → 199 to differentiate it from
glucose-1-phosphate. This nomenclature describes fragmenting the parent ion of 259
m/z and selecting a product fragment ion of 199 m/z for quantification. On
the other hand, no specific product ion was observed for
glucose-1-phosphate (fragment ions 79, 97, 139, and 241 m/z were also seen from
glucose-6-phosphate). In this case, separation in a second dimension (such as HPLC)
would be needed to differentiate signal resulting from glucose-1-phosphate
from that resulting from glucose-6-phosphate. An alternative is to subtract
the signal at fragment ion 79 m/z arising from glucose-6-phosphate (whose
concentration is estimated based on the specific product ion of 199 m/z)
from the total signal, and to assign the residual signal to glucose-1-phosphate
or other hexose phosphate isomers.

Figure 16.4 MS spectra of glucose-1-phosphate and glucose-6-phosphate in negative
ion mode.
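
The subtraction approach for hexose phosphates can be sketched as follows. This is one possible implementation, not a procedure given in the text: it assumes that the ratio of the 79 m/z to the 199 m/z fragment signal for pure glucose-6-phosphate has been measured with a standard under the same instrument settings, and all numbers shown are hypothetical.

```python
def residual_79_signal(total_79, g6p_199, g6p_ratio_79_to_199):
    """Signal at 259 -> 79 not explained by glucose-6-phosphate.

    total_79:            observed signal for the 259 -> 79 transition
    g6p_199:             observed signal for the G6P-specific 259 -> 199 transition
    g6p_ratio_79_to_199: 79/199 signal ratio measured for a pure G6P standard
                         (an assumed calibration step)
    """
    g6p_contribution = g6p_199 * g6p_ratio_79_to_199
    # Clamp at zero so that noise cannot yield a negative residual.
    return max(total_79 - g6p_contribution, 0.0)

# Hypothetical peak areas and calibration ratio
residual = residual_79_signal(total_79=50000.0, g6p_199=10000.0,
                              g6p_ratio_79_to_199=3.0)
```

The residual is attributable to glucose-1-phosphate or other hexose phosphate isomers collectively; the calculation cannot distinguish among them.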



11.2. High-resolution mass analyzers 

Resolution refers to the ability to separate compounds based on small mass
differences. Resolution can be quantified based on Δm, the smallest mass
difference that can be resolved. As Δm typically increases with analyte mass
(m), resolution is more often reported as m/Δm.

A common and cost-effective means of gathering high mass resolution
data is time-of-flight (TOF) MS. Each TOF scan involves measuring the time
that ions, which have been accelerated through a fixed voltage, take to traverse
a flight path. Ions of low m/z fly faster, and thus reach the detector earlier. Modern
TOF instruments provide ~10,000 m/Δm, which is adequate to
differentiate, for example, protonated lysine (C6H15N2O2+, 147.1128 m/z) and
protonated glutamine (C5H11N2O3+, 147.0765 m/z), with m/Δm ~ 4000.
Resolution of commercial TOF instruments is expected to increase further in
the near future, perhaps to ~50,000 m/Δm.
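
The m/Δm figure quoted for lysine and glutamine follows directly from the two m/z values. A minimal sketch of the calculation is given below; the function names are illustrative, and taking m as the mean of the two masses is a convention adopted here rather than something specified in the text.

```python
def required_resolving_power(mz_a, mz_b):
    """m/Δm needed to separate two ions, taking m as the mean of the two m/z values."""
    delta = abs(mz_a - mz_b)
    if delta == 0:
        raise ValueError("identical m/z values cannot be resolved by mass")
    return ((mz_a + mz_b) / 2) / delta

def can_resolve(instrument_resolving_power, mz_a, mz_b):
    """True if the instrument's m/Δm meets or exceeds what the ion pair requires."""
    return instrument_resolving_power >= required_resolving_power(mz_a, mz_b)

# m/z values quoted in the text for protonated lysine and glutamine
lys_gln = required_resolving_power(147.1128, 147.0765)
```

For this pair the requirement comes out near 4000, so a TOF instrument at ~10,000 suffices, whereas a quadrupole that cannot separate ions differing by <0.1 amu does not.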

A Fourier transform ion cyclotron resonance (FTICR) mass spectrometer
provides yet higher mass resolution (~100,000-1,000,000 m/Δm)
(Breitling et al., 2006). It involves the circular motion of ions in a magnetic
field, where the frequency of rotation relates to the ion's m/z. Its key
disadvantage is its very high cost. Orbitrap mass analyzers involve some
similar principles to ion cyclotron resonance, but without the need for
a magnet. Ions circulate around a central, roughly cylindrical electrode
(depicted schematically in Fig. 16.3), with the attractive electric field from
the electrode providing the required centripetal force. This is roughly
analogous to the attractive gravitational field from the sun providing the
required centripetal force to hold the planets in orbit. However, the central
electrode of the Orbitrap is not actually cylindrical. Instead, it is shaped so
as to result in the ions oscillating along the electrode's long axis. The
frequency of this motion parallel to the electrode's axis depends on the ion's
m/z, and the measured oscillation frequency can be Fourier transformed to reveal
ion mass with up to ~350,000 resolution (Makarov et al., 2009), with
~100,000 resolution reliably obtained using commercially available instruments.
The resolving power of ~100,000 has several uses; for example,
one can differentiate singly 13C- and singly 15N-labeled glutamine
(12C4 13C H11 14N2 O3+, 148.0803 m/z, vs. 12C5 H11 14N 15N O3+, 148.0740
m/z, m/Δm ~ 24,000), which is helpful in following isotope tracers. Note
that the different masses of the 13C- and 15N-labeled forms reflect the mass
difference for a neutron in different nuclei; that is, different nuclear energies
result in measurable mass differences.
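
The resolving power needed to separate two ions can be estimated directly from their m/z values. The sketch below simply divides the mean m/z by the m/z difference, using the two pairs of ions quoted in the text; the function name is our own, and the calculation ignores peak shape and relative abundance, both of which affect whether two peaks are resolved in practice.

```python
def required_resolving_power(mz_a, mz_b):
    """Approximate m/Δm needed to separate two ions (mean m/z over difference)."""
    delta = abs(mz_a - mz_b)
    if delta == 0:
        raise ValueError("identical m/z values cannot be resolved")
    return 0.5 * (mz_a + mz_b) / delta

# Protonated lysine vs. protonated glutamine (values quoted in the text)
lys_vs_gln = required_resolving_power(147.1128, 147.0765)

# Singly 13C- vs. singly 15N-labeled protonated glutamine (values quoted in the text)
c13_vs_n15 = required_resolving_power(148.0803, 148.0740)
```

The first pair requires roughly 4000 m/Δm and the second roughly 23,500, consistent with the approximate figures given above.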

Among these mass analyzers, we consider both TOF and Orbitrap 
reasonable choices for typical metabolomics users. In our hands, both 
offered similar sensitivity, linear dynamic range, and quantitative reproduc- 
ibility, with the Orbitrap providing greater specificity due to its higher 
resolving power. Note that we use "Orbitrap" to refer to the Orbitrap
mass analyzer, not the hybrid ion trap-Orbitrap MS/MS instrument that is
widely used in proteomics and discussed in Section 11.3. In addition to
being a component of this hybrid instrument, the Orbitrap is currently sold
as a stand-alone mass analyzer by Thermo under the brand name "Exactive"
at prices roughly comparable to high-quality TOF instruments. Irrespective
of the instrument employed, the main data output of high-resolution
MS is full scan MS spectra, collected typically at a rate of ~1 spectrum/s.
With appropriate software (discussed below), the data can be used both to
quantitate known compounds and to identify other compounds that vary
across biological conditions.



11.3. Hybrid instruments 

Hybrid instruments involve the combination of two or more mass analyzers 
within a single instrument. Typically, this involves a low-resolution ana- 
lyzer that filters ions on the front end (e.g., quadrupole or ion trap) and a 
high-resolution mass analyzer on the back end (e.g., TOF or Orbitrap). At 
present, the most commercially important hybrid instruments are quadru- 
pole-TOF (Q-TOF) and linear ion trap-Orbitrap. These instruments pro- 
vide full scan MS capabilities similar to stand-alone high-resolution mass 
analyzers, although typically with some decrement in sensitivity due to ion 
losses in the front end. They also enable MS/MS analysis: ions can be 
selected and fragmented in the front end, and the fragment ions passed to 
the back end for high resolution analysis. 

An important capability of hybrid instruments is data-dependent MS/ 
MS analysis: collecting a full scan MS spectrum, and then running MS/MS 
on the most prevalent ion(s) present in the full scan. (Such data-dependent 
analysis can also be run on a standard ion trap instrument, but without the 
benefits of high resolution). Typically, the full scan data is used for quanti- 
tation, with the MS/MS data used for compound identification. This 
approach is well-proven in proteomics, where MS/MS spectra are searched 
against genome sequences to identify peptides. In metabolomics, the need 
for MS/MS is arguably less, as the combination of accurate mass and 
retention time is often adequate to identify known analytes. For unknown 
analytes, while MS/MS data is a valuable first step in identification, it is not 
typically sufficient. Thus, while it is now routine practice to use data-dependent
MS/MS for proteomics applications (Yates et al., 2009), we do
not routinely use this approach for metabolomics, as it involves additional
data management challenges and higher MS instrument cost. Others, however,
have been successful through routine use of MS/MS (Lawton et al.,
2009).




12. Targeted Data Analysis 

Raw LC-MS data involves the three dimensions of retention time, m/z,
and signal intensity (Fig. 16.5). In triple quadrupole data, the m/z domain
is discrete: MRM scan events. In high-resolution mass spectrometry data,
the m/z domain is continuous if collected in "profile" mode and
semicontinuous if collected in "centroid" mode, where the instrument
records the central m/z for each mass spectral peak, instead of its full
distribution.

Figure 16.5 Chromatograms and mass spectra are two-dimensional slices of 3D
LC-MS data. The upper 3D plot was generated using a small subset of data collected from
LC-MS analysis of a yeast extract on an Orbitrap mass analyzer in negative ionization
mode. The plot is centered around the anion formed by deprotonation of nicotinamide
adenine dinucleotide phosphate, oxidized form: [NADP-H]- (m/z = 742.069,
RT = 30 min). Four distinct peaks can be observed. The peak at m/z = 743.077,
RT = 30 min is the 13C M + 1 isotopic variant of [NADP-H]-. The identities of
the other two peaks are not known. Chromatograms are generally used for quantitation,
while mass spectra are used for the identification of coeluting compounds,
fragments, adducts, and isotopic variants.

Targeted metabolomics data are typically visualized as ion-specific chro- 
matograms: signal plotted versus retention time for a specified m/z, that is, 
for a specific MRM scan in triple quadrupole data or for an m/z range 
centered about the m/z of the targeted metabolite in high-resolution full
scan MS data (Fig. 16.5). The main goal of targeted metabolomics is to
quantitate known compounds. This involves reducing ion-specific chro- 
matograms to a scalar value proportional to the compound concentration. 
There are two main requirements: finding the correct peak and quantitating 
the peak intensity. 
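
To make the idea of an ion-specific chromatogram concrete, the sketch below extracts one from centroided full scan data represented as (retention time, m/z, intensity) tuples. This data layout, the function name, and the 10 ppm window are assumptions chosen for illustration; they are not taken from any particular software package.

```python
def extract_ion_chromatogram(points, target_mz, tol_ppm=10.0):
    """Sum intensity within target_mz +/- tol_ppm at each retention time.

    points: iterable of (retention_time, mz, intensity) tuples (assumed layout).
    Returns a list of (retention_time, summed_intensity) sorted by time.
    Retention times with no signal in the window are omitted, not zero-filled.
    """
    half_width = target_mz * tol_ppm * 1e-6
    trace = {}
    for rt, mz, intensity in points:
        if abs(mz - target_mz) <= half_width:
            trace[rt] = trace.get(rt, 0.0) + intensity
    return sorted(trace.items())

# Toy data loosely modeled on the [NADP-H]- example of Fig. 16.5
toy_points = [
    (29.9, 742.069, 10.0),
    (30.0, 742.070, 50.0),
    (30.0, 743.077, 5.0),   # M + 1 isotopic variant, outside the window
    (30.1, 742.068, 20.0),
]
chromatogram = extract_ion_chromatogram(toy_points, 742.069)
```

The resulting trace can then be reduced to a single intensity value by locating the correct peak and quantitating it.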

If several peaks are present in an ion-specific chromatogram, selecting 
the correct peak for a targeted analyte requires prior knowledge of the 
compound's retention time. A challenge is that retention time can vary from 
sample-to-sample (e.g., due to differences in sample matrix, environmental 
factors, or column aging). Accordingly, it is desirable to align chromato- 
grams prior to looking for the peak corresponding to the analyte of interest. 
Alignment relies on the fact that retention time shifts tend to be consistent 
across compounds. A number of computational algorithms for alignment
have been developed (Lange et al., 2008; Smith et al., 2006). Once
samples are aligned, peaks eluting at the same time are grouped. Peaks are 
then quantitated, and the peak size for each group in each sample reported. 
This yields a matrix with peaks as rows, samples as columns, and peak 
intensities as the entries. 
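
The alignment and grouping steps can be sketched as follows. This is a deliberately simplified illustration that assumes a single constant retention-time shift per sample, estimated as the median offset of landmark compounds; the published alignment algorithms cited above are considerably more sophisticated. All function names, data layouts, and the 0.5 min tolerance are hypothetical.

```python
from statistics import median

def estimate_shift(sample_rts, reference_rts):
    """Median retention-time offset of a sample relative to a reference.

    Both arguments are dicts mapping landmark compound name -> retention time.
    """
    shared = sample_rts.keys() & reference_rts.keys()
    if not shared:
        raise ValueError("no shared landmark compounds")
    return median(sample_rts[name] - reference_rts[name] for name in shared)

def build_peak_matrix(samples, expected_rts, shifts, rt_tol=0.5):
    """Rows = targeted compounds, columns = samples, entries = peak intensity.

    samples: dict sample -> dict compound -> list of (rt, intensity) peaks
    expected_rts: dict compound -> expected (reference) retention time
    shifts: dict sample -> retention-time shift to subtract before matching
    Entries are None when no peak lies within rt_tol of the expected time.
    """
    matrix = {}
    for compound, expected in expected_rts.items():
        row = {}
        for sample, peaks_by_compound in samples.items():
            best = None
            for rt, intensity in peaks_by_compound.get(compound, []):
                error = abs((rt - shifts[sample]) - expected)
                if error <= rt_tol and (best is None or error < best[0]):
                    best = (error, intensity)
            row[sample] = best[1] if best else None
        matrix[compound] = row
    return matrix
```

A nested dictionary is used here only to keep the sketch dependency-free; in practice a data frame or array would be the more natural container for the peak matrix.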

Peak intensity can be measured a number of ways, with the two most 
common being peak height or peak area. For a fixed-width Gaussian peak, 
peak height and area are linearly related; thus both approaches are equiva- 
lent. If peak width varies (i.e., is broader in one sample than another), then 
area is the more reliable metric. However, calculations of area depend on 
where one draws the peak boundaries and baseline. Accordingly, if S/N is 
poor, peak height can be a more robust metric. A compromise that we 
sometimes employ involves summing the highest few points in a peak. 
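
The three intensity metrics discussed above can be compared with a short sketch. It assumes the peak boundaries have already been chosen and that the baseline is flat; the choice of three points for the "highest few points" metric is an arbitrary illustrative default, not a recommendation from the text.

```python
def peak_height(intensities):
    """Maximum intensity within the peak boundaries."""
    return max(intensities)

def peak_area(times, intensities, baseline=0.0):
    """Trapezoidal area above a flat baseline between the given boundaries."""
    area = 0.0
    for i in range(1, len(times)):
        left = max(intensities[i - 1] - baseline, 0.0)
        right = max(intensities[i] - baseline, 0.0)
        area += 0.5 * (left + right) * (times[i] - times[i - 1])
    return area

def top_n_sum(intensities, n=3):
    """Compromise metric: sum of the n highest points in the peak."""
    return sum(sorted(intensities, reverse=True)[:n])

# Toy triangular peak sampled at unit time intervals
toy_times = [0.0, 1.0, 2.0, 3.0, 4.0]
toy_intensities = [0.0, 2.0, 6.0, 2.0, 0.0]
```

Because the area depends on the chosen boundaries and baseline while the height does not, comparing the metrics on the same peak is a quick way to see how sensitive a result is to those choices.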

We have developed an open-source software package that automates the
analysis of targeted metabolomics data (Melamud, 2009). The package,
"mzROCK," is written in the statistical software package R and freely
available at http://code.google.com/p/mzrock/. It is designed for MRM
methods, but the concepts (and most of the code) apply to any targeted
LC-MS data. The package includes features that help to cope with the
imperfections of real LC-MS data. For example, it reports not only peak
sizes, but also a statistical estimate of the peak quality. The estimate of peak
quality takes into account factors like retention time consistency and S/N,
integrating them via a random forest algorithm. The peak quality estimate is
a valuable guide as to which peaks to ignore (those with very low quality scores)
and which are most in need of careful manual review (those with intermediate
scores). To facilitate manual data review, the mzROCK software package
produces a PDF file that presents all raw ion-specific chromatograms in an
easy-to-read, searchable manner. Manual review of raw ion-specific chromatograms
is standard practice in LC-MS labs. We encourage yeast biologists
working at the interface with metabolomics to devote some time to
examining ion-specific chromatograms.



13. Untargeted Data Analysis

A major advantage of high-resolution mass spectrometers is the ability
to quantitate known analytes while simultaneously collecting untargeted
data on all ions present, including ones arising from novel compounds.
Although involving greater data management challenges due to larger file
sizes, untargeted data analysis involves the same basic steps as targeted data
analysis, except with the added complication of selecting which m/z are
interesting. The motivation to focus on specific m/z reflects the otherwise
excessive number of ion-specific chromatograms generated by current
high-resolution instruments. For example, for m/z < 1000, an Orbitrap
mass analyzer can reliably distinguish analytes differing by <0.01 m/z,
resulting in >100,000 differentiable mass slices. Most of these mass slices
are empty, however, as no chemically feasible combination of common
elements generates analytes of that m/z. In addition, some mass slices
contain only noise. Eliminating such slices tends to yield a manageable
number of ion-specific chromatograms, from which peaks can be identified,
aligned, and quantitated. As for targeted analysis, the net outcome is a
matrix with peaks as rows, samples as columns, and peak intensities as the
entries. The difference is that peaks do not necessarily correspond to known
metabolites, but may instead be identified solely based on m/z and retention
time. The Siuzdak lab has developed software called "XCMS" that effectively
conducts the above steps (Smith et al., 2006). XCMS accepts data in
the common open-exchange format "mzXML" (Pedrioli et al., 2004);
other open formats are "mzData" (Orchard et al., 2004) and "mzML"
(Deutsch, 2008). Moving to a common open-exchange format will facilitate
sharing of metabolomics data going forward.

MS/MS data can contribute to untargeted data analysis in two ways. 
Peaks with identical MS/MS spectra provide useful benchmarks for chromatogram
alignment (Benton et al., 2008). Also, MS/MS spectra provide
additional data for peak identification. 

A complication in untargeted LC-MS is that the number of observed
peaks may markedly exceed the number of actual metabolites present: signal 
can arise from different adduct ions, isotopic variants, and in-source 
fragmentation. 

13.1. Adducts 

Common adduct ions are listed in Table 16.2. Adduct formation occurs
during ionization. Accordingly, different adducts of the same metabolite
coelute chromatographically, which provides a useful means for their identification.
In practice, we typically quantitate based on [M + H]+ and [M - H]-
peaks, as formation of other adduct ions may vary strongly based on the
biological matrix (e.g., sodium concentration in the sample). An alternative
approach is to identify all adduct ions of a given parent and sum their
intensities; however, this requires additional computational tools.

Table 16.2 Some common adduct species formed by positive and negative mode
electrospray ionization

Adduct species                Mol ratio  Charge  ΔM         Polarity

[M + H]+                      1          1       1.0072     +
[M + NH4]+                    1          1       18.0338    +
[M + H3O]+                    1          1       19.0184    +
[M + Na]+                     1          1       22.9899    +
[M + CH3OH + H]+              1          1       33.0335    +
[M + K]+                      1          1       38.9632    +
[M + CH3CN + H]+              1          1       42.0338    +
[M + 2Na - H]+                1          1       44.9726    +
[M + CH3CN + Na]+             1          1       64.0165    +
[M + HCOOH + Na]+             1          1       68.9954    +
[M + 2K - H]+                 1          1       76.9191    +
[M + CH3COOH + Na]+           1          1       83.0110    +
[M + 2CH3CN + H]+             1          1       83.0603    +
[M + tributylamine + H]+      1          1       186.2222   +
[M + 2H]2+                    1          2       2.0145     +
[M + H + NH4]2+               1          2       19.0410    +
[M + H + Na]2+                1          2       23.9971    +
[M + H + K]2+                 1          2       39.9704    +
[M + CH3CN + 2H]2+            1          2       43.0410    +
[M + 2Na]2+                   1          2       45.9798    +
[M + Na + K]2+                1          2       61.9531    +
[M + 2K]2+                    1          2       77.9263    +
[M + 2CH3CN + 2H]2+           1          2       84.0675    +
[M + 3CH3CN + 2H]2+           1          2       125.0941   +
[M + tributylamine + 2H]2+    1          2       187.2288   +
[M + 3H]3+                    1          3       3.0217     +
[M + 2H + Na]3+               1          3       25.0044    +
[M + H + 2Na]3+               1          3       46.9871    +
[M + Fe]3+                    1          3       55.9344    +
[M + 3Na]3+                   1          3       68.9698    +
[M + 3K]3+                    1          3       116.8895   +
[2M + H]+                     2          1       1.0072     +
[2M + NH4]+                   2          1       18.0338    +
[2M + Na]+                    2          1       22.9899    +
[2M + K]+                     2          1       38.9632    +
[2M + CH3CN + H]+             2          1       42.0338    +
[2M + 3H2O + 2H]+             2          1       56.0461    +
[2M + CH3CN + Na]+            2          1       64.0165    +
[M - H]-                      1          1       -1.0072    -
[M - OH]-                     1          1       17.0067    -
[M - H2O - H]-                1          1       -20.0250   -
[M + Na - 2H]-                1          1       20.9755    -
[M + Na - 2H]2-               1          1       20.9755    -
[M + CH3OH - H]-              1          1       31.0190    -
[M + Cl]-                     1          1       34.9694    -
[M + K - 2H]-                 1          1       36.9487    -
[M + CH3CN - H]-              1          1       40.0193    -
[M + HCOO]-                   1          1       44.9982    -
[M + CH3COO]-                 1          1       59.0133    -
[M + NaO2CH - H]-             1          1       66.9803    -
[M + 79Br]-                   1          1       78.9183    -
[M + 81Br]-                   1          1       80.9168    -
[M + NaO2CCH3 - H]-           1          1       80.9960    -
[M + CF3COO]-                 1          1       112.9856   -
[M + tributylamine - H]-      1          1       184.2071   -
[M - 2H]2-                    1          2       -2.0145    -
[M - 3H]3-                    1          3       -3.0217    -
[2M - H]-                     2          1       -1.0072    -
[2M + HCOO]-                  2          1       44.9982    -
[3M - H]-                     3          1       -1.0072    -

The relative ratios of the adduct species depend on the analyte concentration, solvent composition, pH,
and concentrations of other ions like sodium. To calculate adduct m/z given parent m/z use the
following formula: adduct [m/z] = ((parent [m/z]) × (mol ratio) + ΔM)/(charge).
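
The formula in the footnote to Table 16.2 translates directly into code. The sketch below copies a handful of rows from the table into a dictionary (the dictionary and function names are our own) and applies the formula to glutamine, whose neutral monoisotopic mass of 146.0691 is supplied here as an outside value rather than taken from the table.

```python
# (mol ratio, charge, ΔM) for a few adducts copied from Table 16.2
ADDUCTS = {
    "[M + H]+": (1, 1, 1.0072),
    "[M + Na]+": (1, 1, 22.9899),
    "[M + 2H]2+": (1, 2, 2.0145),
    "[2M + H]+": (2, 1, 1.0072),
    "[M - H]-": (1, 1, -1.0072),
}

def adduct_mz(parent_mass, adduct):
    """adduct m/z = (parent mass x mol ratio + ΔM) / charge."""
    mol_ratio, charge, delta_mass = ADDUCTS[adduct]
    return (parent_mass * mol_ratio + delta_mass) / charge

GLUTAMINE = 146.0691  # neutral monoisotopic mass (assumed input, not from the table)
expected_mz = {name: adduct_mz(GLUTAMINE, name) for name in ADDUCTS}
```

Coeluting peaks whose m/z values match such a list for a common parent mass are candidates for being adducts of the same metabolite, and their intensities could then be summed as described above.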



13.2. Isotopic variants 

In the absence of isotopic labeling, the most common isotopic variant is
13C, with the abundance of the 13C M + 1 peak (S_{M+1}) given by

\[
\frac{S_{M+1}}{S_{M}} = \frac{N \times 0.011 \times (0.989)^{N-1}}{(0.989)^{N}} \approx 0.011N
\]

where N is the number of carbon atoms in the metabolite and S_M is the
intensity of the M + 0 peak. Other isotopic variants that can often be
detected include 15N, 2H, 34S, etc.
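
The relation between carbon count and the relative size of the 13C M + 1 peak can be checked numerically. The sketch below evaluates the binomial expression for the ratio and inverts it to estimate the number of carbons from an observed ratio; it considers 13C only, using the 1.1% natural abundance given above, and the function names are our own.

```python
def m_plus_1_ratio(n_carbons, p13c=0.011):
    """Ratio of the 13C M + 1 peak to the M + 0 peak for n_carbons carbon atoms."""
    p12c = 1.0 - p13c
    return n_carbons * p13c * p12c ** (n_carbons - 1) / p12c ** n_carbons

def estimate_carbon_count(ratio, p13c=0.011):
    """Invert the relation above: nearest integer number of carbons."""
    return round(ratio * (1.0 - p13c) / p13c)

ratio_5c = m_plus_1_ratio(5)     # e.g., a five-carbon metabolite such as glutamine
ratio_21c = m_plus_1_ratio(21)   # e.g., a 21-carbon metabolite
```

For five carbons the M + 1 peak is roughly 5.6% of the M + 0 peak, and for 21 carbons roughly 23%, so the ratio provides a useful check on a proposed molecular formula. Contributions from 15N, 2H, and other isotopes are neglected in this sketch.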




13.3. In-source fragmentation 

With modern ESI sources, in-source fragmentation is less of an issue than 
adduct formation; however, it is still worth being aware of the possibility for 
in-source losses of water, ammonia, carbon dioxide, phosphate, etc. 

13.4. Unknown identification 

Once the myriad peaks found in untargeted LC-MS analysis are reduced
to the smaller set of probable underlying metabolites, the question
becomes determining their identities. A first-pass approach involves searching
known metabolite databases, such as KEGG, MetaCyc, Human
Metabolome Database, and METLIN, for known compounds of the correct
exact mass (Smith et al., 2005; Wishart et al., 2009). When a match is found,
the question then becomes whether it is a true positive or false positive.
Three criteria are of immediate assistance: isotopic variants (e.g., if the
candidate structure contains sulfur, then there should be an ~5% M + 2
peak); retention time (e.g., if the compound is a monophosphate, it should
elute close to other related monophosphates); and MS/MS spectrum if
available. More definitive structure assignment typically requires obtaining
pure standard (which should have the same MS/MS spectrum as the
endogenous compound, and coelute by at least two orthogonal chromatography
methods) and/or purification and NMR analysis.
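
The first-pass exact-mass search can be sketched as below. The two-entry dictionary is a hypothetical stand-in for a real metabolite database (it does not use the interface of KEGG, MetaCyc, HMDB, or METLIN), the 5 ppm tolerance is an illustrative assumption, and only [M + H]+ and [M - H]- ions are considered.

```python
# Monoisotopic masses of the most abundant isotopes, and the proton mass
MONOISOTOPIC = {"C": 12.0, "H": 1.007825, "N": 14.003074,
                "O": 15.994915, "P": 30.973762, "S": 31.972071}
PROTON = 1.007276

def neutral_mass(formula):
    """Monoisotopic mass from a dict such as {"C": 5, "H": 10, "N": 2, "O": 3}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())

def search_by_mass(observed_mz, database, polarity="+", tol_ppm=5.0):
    """Names whose [M + H]+ (or [M - H]-) m/z lies within tol_ppm of observed_mz.

    Returns a list of (name, ppm_error) tuples.
    """
    sign = 1.0 if polarity == "+" else -1.0
    hits = []
    for name, formula in database.items():
        expected = neutral_mass(formula) + sign * PROTON
        ppm_error = (observed_mz - expected) / expected * 1e6
        if abs(ppm_error) <= tol_ppm:
            hits.append((name, round(ppm_error, 2)))
    return hits

# Hypothetical toy database
TOY_DATABASE = {
    "glutamine": {"C": 5, "H": 10, "N": 2, "O": 3},
    "lysine": {"C": 6, "H": 14, "N": 2, "O": 2},
}
hits = search_by_mass(147.0765, TOY_DATABASE)
```

A mass match obtained this way is only a candidate; the isotopic pattern, retention time, and MS/MS criteria described above are still needed to distinguish true from false positives.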

When no database match is found, structure elucidation becomes yet
more challenging. Exact mass information provides a useful starting point
for determining the compound's molecular formula, and a variety of
molecular formula calculators, including ones that take into account isotopic
variants present, are available (Jarussophon et al., 2009; Koch et al.,
2007; Sana et al., 2008). Additional information on molecular formula can be
obtained by feeding 13C-, 15N-, or other isotope-labeled nutrients, and
observing labeling of the unknown compounds of interest. If one obtains
from such analyses a guess as to the metabolic origin of the novel species,
feeding more downstream metabolic precursors in labeled form can clarify
the precise metabolic origin. Throughout the above steps, MS/MS information
is quite useful (Ashline et al., 2007; Kurimoto et al., 2006). For
example, one could demonstrate that a novel species contains adenine by
(1) finding a characteristic adenine fragment ion and (2) feeding labeled
adenine and observing label incorporation.
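
To illustrate how exact mass constrains molecular formula, the sketch below enumerates C/H/N/O formulas by brute force. It is not one of the published calculators cited above: the element set, the search limits, the 5 ppm tolerance, and the crude hydrogen-count limit are all simplifying assumptions. The optional n_carbons argument shows how a carbon count inferred from 13C labeling could narrow the search.

```python
MASS = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915}

def candidate_formulas(neutral_mass, tol_ppm=5.0, n_carbons=None,
                       max_c=12, max_n=5, max_o=8):
    """Enumerate (C, H, N, O) atom counts whose monoisotopic mass matches.

    n_carbons: if given (e.g., from 13C labeling), only that carbon count is tried.
    """
    tol = neutral_mass * tol_ppm * 1e-6
    carbon_counts = [n_carbons] if n_carbons is not None else range(1, max_c + 1)
    hits = []
    for c in carbon_counts:
        for n in range(max_n + 1):
            for o in range(max_o + 1):
                heavy = c * MASS["C"] + n * MASS["N"] + o * MASS["O"]
                h = round((neutral_mass - heavy) / MASS["H"])
                # Crude valence limit: no more H than a saturated acyclic molecule
                if h < 0 or h > 2 * c + n + 2:
                    continue
                if abs(heavy + h * MASS["H"] - neutral_mass) <= tol:
                    hits.append((c, h, n, o))
    return hits

all_candidates = candidate_formulas(146.0691)
five_carbon_candidates = candidate_formulas(146.0691, n_carbons=5)
```

For a neutral mass of 146.0691 the candidates include C5H10N2O3, the formula of glutamine. At higher masses the number of candidates grows quickly, which is why isotope labeling and MS/MS information become important.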



14. Future Outlook 

Yeast metabolomics is still a young field with many advances in
instrumentation and computation on the horizon. This makes it ripe for
both technical innovation and biological discovery. Much can be learned
from the examples of genomics and transcriptomics, in terms of building
common resources (e.g., SGD) and developing data visualization tools (e.g., 
clustered heat maps). A unique feature of metabolomics is the well-defined 
connections between metabolites: the metabolic map that was in many 
respects the first great achievement in systems biochemistry. The ability to 
conveniently relate data, both metabolomic and fluxomic, to this map is an 
immediate need. Another immediate need is experimental and computa- 
tional tools for flux profiling that are simple and robust enough to be widely 
used throughout the yeast community. Looking further into the future, the 
challenges of metabolomics will likely coalesce with those of systems biol- 
ogy: integration of diverse types of data to gain insight into the biochemical 
mechanisms underlying the complex emergent properties that distinguish 
living systems.



Abstract

Yeast cells in an isogenic population do not all display the same phenotypes. 
To study such variation within a population of cells, we need to perform 
measurements on each individual cell instead of measurements that average 
out the behavior of a cell over the entire population. Here, we provide the basic 
concepts and a step-by-step protocol for a recently developed technique 
enabling one such measurement: fluorescence in situ hybridization that renders
single mRNA molecules visible in individual fixed cells.




1. Introduction 

Within an isogenic population of yeast cells, the behavior of any
individual cell can differ markedly from the average behavior of the population
(Raj and van Oudenaarden, 2008). For example, it has been shown
that random partitioning of proteins during cell division leads to variability
in the number of proteins in individual cells (Rosenfeld et al., 2005), while
random bursts of transcription result in variability in the number of mRNAs
(Chubb et al., 2006; Golding et al., 2005; Raj et al., 2006). These are just a
few examples that highlight the importance of studying the behavior of a
single cell rather than that of the whole population. One primary tool for
studying the behavior of a single cell is a fluorescent protein such as GFP
(green fluorescent protein). The most straightforward application of a
fluorescent protein is to have it either driven by the promoter of interest 
or fused to the protein of interest to study variability in gene expression. Yet 
while the use of fluorescent proteins has certainly been pivotal in monitor- 
ing gene expression, fluorescent proteins suffer from a number of limita- 
tions. One such limitation is their low sensitivity: fluorescence from GFP 
and its variants is typically undetectable at the small number of molecules 
involved in studying gene expression. In yeast, fluorescence from GFP is 
typically detectable only when many hundreds of GFPs are present in a cell; 
the abundance of many transcription factors, for example, falls below this 
limit. Since the effects of expression variability are magnified when the 
number of molecules is low, the sensitivity limitation may preclude effective 
study of these processes. Another issue is that it is difficult to quantify the 
exact number of fluorescent proteins in individual cells because it is difficult 
to measure the amount of fluorescence emitted by a single GFP molecule. 
In addition, the slow decay time of fluorescent proteins (due to their 
relatively high stability) means that fluorescence is only diluted by cell 
division but not through other degradation mechanisms. This prevents 
observation of rapidly varying changes in gene activation, effectively 
averaging temporal fluctuations. 

While having a fluorescent protein expressed by the promoter of interest
or fused to a protein of interest suffers from a number of drawbacks, other
applications of fluorescent proteins have led to powerful techniques enabling
the detection of a single mRNA molecule in a single cell. The mRNA of a given
gene has been difficult to detect in single cells in the past because each cell
contains very few copies of it at any one time. One such technique is the
MS2 mRNA detection scheme (Beach et al., 1999; Bertrand et al., 1998).
One way to implement this technique is to engineer a gene so that its
mRNA contains 96 copies of a particular RNA hairpin in its untranslated
region. These hairpins bind tightly to the coat protein of the bacteriophage
MS2. Therefore, by also having a gene expressing the MS2 coat
protein fused to GFP in the cell, a single mRNA with the 96 copies of the RNA
hairpin will emit enough fluorescence to be resolved as a single
diffraction-limited spot under a fluorescence microscope. This method can
help measure the transcription of a gene in real time in a single cell, as was
done in Escherichia coli (Golding et al., 2005). Despite the vast improvement
in resolution the MS2 method provides over conventional methods using
GFP and its variants, it has two disadvantages: the mRNAs tend to aggregate,
and the regulation of the mRNA may differ from that of the endogenous
transcript (thus one monitors this altered regulation rather than the endogenous one)
because the mRNA has been engineered to carry the long artificial sequence for
hairpin formation.

In this chapter, we describe a fluorescence in situ hybridization (FISH)
method (Gall, 1968; Levsky and Singer, 2003) for detecting single endogenous
mRNA molecules in individual yeast cells (Raj et al., 2008). Since
the target gene sequence does not have to be modified to use this method, it
bypasses the aforementioned problems associated with engineering the
mRNA to have hairpin-forming sequences in the MS2 mRNA detection
scheme. It is also highly sensitive and allows for the counting of mRNA
molecules in single cells, thus obviating many of the issues, mentioned above,
associated with using GFP either as a fusion to a protein of interest or driven
by a promoter of interest. In this method, we utilize a large collection
(at least 30) of oligonucleotides, each labeled with a single fluorophore, that
bind along the length of the target mRNA (Fig. 17.1A). The binding of so
many fluorophores to a single mRNA results in a signal bright enough to be
detectable with a microscope as a diffraction-limited spot. The method we
describe is a modification of the RNA FISH method described by Singer
and coworkers (Femino et al., 1998), in which the authors use a smaller
number (~5) of longer oligonucleotides (~50 bp), each of which contains
up to five fluorophores (Fig. 17.1B). While that method has been used
successfully to count mRNAs in single cells (Long et al., 1997; Maamar
et al., 2007; Sindelar and Jaklevic, 1995; Zenklusen et al., 2007), it has not
been widely adopted. This may be due to the difficulties and costs associated
with synthesizing and purifying several oligonucleotides with the internal
modifications required to label those oligonucleotides with multiple fluorophores.
Another potential issue is self-quenching between tightly spaced
fluorophores. We anticipate that the simplicity of the method described
herein will allow many researchers to utilize single-molecule RNA FISH in
their own studies.

[Figure 17.1 panels: (A) 1 fluorophore/probe, ~20 bp/probe; (B) 3-5 fluorophores/probe, ~50 bp/probe; in both panels the probes are shown bound along the target mRNA.]

Figure 17.1 Comparison between two in situ hybridization methods for imaging a
single mRNA molecule. (A) Method of Raj et al. (2008) involves about 30 or more
singly labeled probes, each about 20 bases long, that bind along the stretch of a target
mRNA molecule. (B) Method of Femino et al. (1998) involves multiple fluorophores
(between 3 and 5) coupled to a single oligonucleotide probe of about 50 bases long that
binds along the stretch of a target mRNA molecule.




2. RNA FISH Protocol 

A brief overview of our method is as follows. A set of short (between
17 and 22 bases long) oligonucleotide probes that bind to a desired target
mRNA is designed, and each probe is coupled to a single fluorophore with the
desired spectral properties. After the yeast cells are fixed, these probes are
hybridized to the target mRNA molecule. This results in multiple (typically
about 48) singly labeled probes bound to a single mRNA molecule. In turn,
the mRNA molecule gives off enough fluorescence to be detected as a
diffraction-limited spot using a standard fluorescence microscope. Below we
describe a step-by-step procedure for implementing RNA FISH in
Saccharomyces cerevisiae.



2.1. Designing oligonucleotides 

The first step is the design of a collection of oligonucleotide probes that
together are complementary to a large part of the open reading frame of the
target mRNA (one can also utilize the untranslated regions of the mRNA, if
necessary). Each probe is between 17 and 22 bases long, and we have
generally found that 30 or more such probes are sufficient to give a
detectable signal. We have also found that our signals are sometimes clearer
when the GC content of each probe is close to 45%. We also leave a
minimum of two bases as a spacer between adjacent probes along
the mRNA, although it is possible that one can relax this requirement
without any adverse effects. A program that facilitates the design of
probes meeting the constraints mentioned above is freely available at
http://www.singlemoleculefish.com. Sometimes it is not possible to design
probes that meet all of these constraints; the criteria
should not be viewed as absolutes, but rather as guidelines we try to adhere
to when possible. After designing the probes, we order them from companies
with parallel synthesis capabilities (we use BioSearch Technologies,
based in Novato, CA, USA) with 3'-amine modifications. Since the synthesis
typically yields a much larger amount of each oligonucleotide than is
necessary, one should have them synthesized on the smallest possible
scale (we typically have them synthesized on the 10 nmol (delivered) scale).
The 3'-amine then serves as a reactive group for the succinimidyl-ester
coupling of the fluorophore described in Section 2.2.
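
The design constraints above (probe length of 17 to 22 bases, GC content near 45%, and a spacer of at least two bases) can be illustrated with a short sketch. This is not the program available at the website mentioned above; it is a simplified greedy tiling that ignores probe specificity, secondary structure, and any other criteria a real design tool may apply. The function names and the greedy strategy are our own assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)

def design_probes(target, min_len=17, max_len=22, spacer=2, target_gc=0.45):
    """Greedily tile antisense probes along `target`.

    target: mRNA sense-strand sequence written in the DNA alphabet (A, C, G, T).
    At each position, the probe length (min_len..max_len) whose GC fraction is
    closest to target_gc is chosen; probes are separated by `spacer` bases.
    Returns a list of (start_index, probe_sequence), probes written 5' to 3'.
    """
    target = target.upper()
    probes, start = [], 0
    while start + min_len <= len(target):
        longest = min(max_len, len(target) - start)
        best_len = min(
            range(min_len, longest + 1),
            key=lambda n: abs(gc_fraction(target[start:start + n]) - target_gc),
        )
        site = target[start:start + best_len]
        # The probe is the reverse complement of its binding site on the mRNA
        probes.append((start, site.translate(COMPLEMENT)[::-1]))
        start += best_len + spacer
    return probes

# Toy 120-base target, for illustration only
toy_probes = design_probes("ATGC" * 30)
```

For a real transcript, one would check that enough probes (30 or more, per the guideline above) fit within the open reading frame, and extend into the untranslated regions if they do not.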

2.2. Coupling fluorophores to oligonucleotides 

The next step is the attachment of a fluorophore with the desired spectral
properties to the commercially synthesized oligonucleotides (we
describe which fluorophores we use in Section 2.2.1). We do this by
pooling the oligonucleotides and coupling them en masse, thus reducing
the labor involved. In all the steps we describe below, we use RNase-free
water (Ambion) to prepare our solutions and use filtered pipette tips to
prevent aerosol contamination.
Procedure:

1. From the commercially synthesized set of oligonucleotides, each at a
concentration of 100 µM in RNase-free water (we find this is a practical
starting concentration to work with), pipette around 1 nmol (10 µl) of
each oligonucleotide probe into a single microcentrifuge tube (i.e., if
there are 48 probes, then 1 nmol of each of the 48 probe solutions should
be combined into a single tube with a final volume of 480 µl).

2. Add 0.11 volumes (v/v) of 1 M sodium bicarbonate (prepared with
RNase-free water) to this probe mixture, resulting in a final sodium
bicarbonate concentration of 0.1 M. If the total volume of the mixture at
this stage is less than 0.3 ml, add enough 0.1 M sodium bicarbonate to
bring the final volume of the mixture to 0.3 ml.

3. Dissolve roughly 0.2 mg of the desired fluorophore (functionalized with
a succinimidyl ester group) separately in a tube containing 50 µl of
0.1 M sodium bicarbonate. If using tetramethylrhodamine (TMR), first
dissolve it in about 5 µl of dimethyl sulfoxide (DMSO) and then add
50 µl of 0.1 M sodium bicarbonate to it. This is because TMR does not
readily dissolve in aqueous solutions.

4. Add the dissolved fluorophore to the 0.3 ml of probe mixture, vortex,
and cover this tube in aluminum foil to prevent photobleaching from
unwanted exposure to ambient light. Leave the tube in the dark
overnight.

5. Next day, precipitate the probes out of solution by adding 12% (v/v) of
sodium acetate at pH 5.2 followed by 2.5 volumes of ethanol (95% or 100%).

6. Place the tube at -70 °C for at least 1 h, then spin the sample down at
16,000 rpm for at least 15 min at 4 °C.

7. A small colored pellet should have collected at the bottom of the tube at
this stage. This pellet contains both the coupled and uncoupled oligonucleotides.
The vast majority of the uncoupled fluorophore, however,
remains in the supernatant, and so aspirate as much of this supernatant
away as possible without disturbing the pellet (one should take care to
aspirate soon after removal from the centrifuge, since oligonucleotides
have a tendency to redissolve rapidly at room temperature).

Note: Many precipitation protocols now call for another washing step in
70% ethanol. We have found this step unnecessary.

8. The pellet is stable and can be stored at -20 °C for up to 1 year. This
concludes the coupling step.
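
The volumes in steps 1 and 2 depend on the number of probes being pooled, and the arithmetic can be sketched as follows. The function and key names are our own; the calculation assumes every probe stock is at 100 µM (so that 10 µl contains 1 nmol) and covers only the pooling and bicarbonate steps, not the precipitation.

```python
def coupling_volumes(n_probes, ul_per_probe=10.0, min_total_ul=300.0):
    """Volumes (µl) for pooling probes and adding sodium bicarbonate (steps 1-2).

    Assumes each probe stock is 100 µM, so 10 µl contains 1 nmol.
    """
    pooled_ul = n_probes * ul_per_probe
    bicarb_1m_ul = 0.11 * pooled_ul            # 0.11 volumes of 1 M bicarbonate
    subtotal_ul = pooled_ul + bicarb_1m_ul
    # Top up with 0.1 M bicarbonate only if the mixture is below 0.3 ml
    top_up_0p1m_ul = max(0.0, min_total_ul - subtotal_ul)
    total_ul = subtotal_ul + top_up_0p1m_ul
    bicarb_molar = (bicarb_1m_ul * 1.0 + top_up_0p1m_ul * 0.1) / total_ul
    return {
        "pooled_ul": pooled_ul,
        "bicarb_1m_ul": bicarb_1m_ul,
        "top_up_0p1m_ul": top_up_0p1m_ul,
        "total_ul": total_ul,
        "bicarb_molar": bicarb_molar,
    }

volumes_48 = coupling_volumes(48)   # the 48-probe example from step 1
volumes_20 = coupling_volumes(20)   # a smaller set that requires topping up
```

For 48 probes the pooled volume is 480 µl, about 53 µl of 1 M sodium bicarbonate is added, and no top-up is needed; in both cases the final bicarbonate concentration comes out close to 0.1 M.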



2.2.1. Choice of fluorophore and appropriate filter sets 

In order to perform imaging of multiple different RNA species at the same 
time, one needs to select fluorophores with excitation and emission properties 
that can be distinguished by appropriately chosen bandpass filters; otherwise, 
the signal from one channel may potentially bleed into another channel. We 
describe here the fluorophore and filter set combination that we use for our 
microscopy. Other combinations are no doubt feasible as well. 

The fluorophores we utilize are TMR, Alexa 594, and Cy5. TMR has 
proven to be exceptionally photostable in our hands, and its excitation 
maximum of 550 nm aligns nicely with the excitation maxima of mercury 
and metal-halide light sources. Alexa 594 is also quite photostable, and 
while its spectral properties are similar to those of TMR (absorption at 
594 nm), we are able to distinguish its presence using appropriate filters. 
The third fluorophore we use is Cy5, which is rather bright and is spectrally 
separated from the other two fluorophores (Cy5 absorbs at 650 nm). Cy5 
does, however, suffer from photobleaching effects, thus requiring the use of 
a glucose oxidase oxygen scavenging system to make imaging feasible. We 
have not tried any dyes that are further redshifted than Cy5. However, we 
have experimented with Alexa 488, which absorbs at a lower wavelength 
than TMR. While we were sometimes able to detect signals, the higher 
cellular background at these lower wavelengths led to weaker signals, so we 
generally avoid the use of fluorophores bluer than TMR. 

The filter combinations we use are typical bandpass filter and dichroic 
sets mounted in cubes that the microscope can place in the fluorescence 
light path. For TMR, we use a standard XF204 filter from Omega Optical. 
For Alexa 594, we use a custom filter from Omega Optical with a 590DF10 
excitation filter, a 610DRLP dichroic, and a 630DF30 emission filter. For 
Cy5, we use the 41023 filter from Chroma, which is designed for Cy5.5. It is 
likely that a filter more appropriate for Cy5 would work even better. These 
filters do a good job of preventing any signals from one fluorophore 
from being detected in another channel (Raj et al., 2008). Sometimes a 
very bright Alexa 594 signal can bleed somewhat into the TMR channel 
(we estimate the bleedthrough to be about 10%), but in practice this 
bleedthrough is impossible to detect owing to the low signal intensities of the 
mRNA spots. 



2.3. Purification of probes using HPLC 

We now describe a purification procedure for separating the coupled 
oligonucleotides from the uncoupled oligonucleotides. We purify the cou- 
pled oligonucleotides using HPLC (high-performance liquid chromatogra- 
phy): the addition of the fluorophore makes the normally hydrophilic 
oligonucleotide significantly more hydrophobic, allowing for separation 
by chromatography. The HPLC should be equipped with a dual-wavelength 
detector for simultaneous measurement of absorption by DNA 
(at 260 nm) and by the fluorophore (the wavelength depends on the 
fluorophore: e.g., 555 nm for TMR and 594 nm for Alexa 594). In our lab, we 
have used an Agilent 1090 equipped with ChemStation software and a C18 
column suitable for oligonucleotide purification (218TP104). The two buffers 
used for HPLC are 0.1 M triethylammonium acetate ("Buffer A") and acetonitrile 
("Buffer B"). 
Procedure: 

1. Before running the purification program on the HPLC, equilibrate the 
column by flowing 93% Buffer A/7% Buffer B through for about 
10 min; if the column is not equilibrated, then the oligonucleotides 
will simply flow straight through without any separation. 

2. Resuspend the oligonucleotide pellet in an appropriate volume of water 
(we use 115 µl) and then inject this into the HPLC inlet. 

3. Run an HPLC program in which the percentage of Buffer B varies from 
7% to 30% over the course of about 45 min with a flow rate of 1 ml/min. 
During the execution of the program, carefully monitor the two 
absorption curves, one for DNA (at 260 nm) and the other for the coupled 
fluorophore (e.g., 555 nm for TMR and 594 nm for Alexa 594). 
Generally speaking, one will observe two broad peaks over time. The first 
peak, containing the more hydrophilic material, consists of the uncoupled 
oligonucleotides and will only exhibit absorption in the 260 nm channel 
(Fig. 17.2A). This peak may appear relatively ragged due to the presence of 
multiple oligonucleotides, each of which has a slightly different retention 
time in the HPLC. The second peak, often narrower than the first, will 
appear some time after the first peak and contains the coupled 
oligonucleotides; thus, it will show absorption in both the 260 nm and the 
fluorophore (e.g., 555 nm) channels (Fig. 17.2B). The time between the first 
and second peaks varies depending on the hydrophobicity of the fluorophore; 
we have found that oligonucleotides coupled to Cy5 have a long 
retention time of almost 20 min after the first peak, whereas TMR and 
Alexa 594 result in shorter retention shifts (Fig. 17.2B). 

Figure 17.2 Chromatographs obtained during the HPLC purification of 
oligonucleotides coupled to the fluorophore (Alexa 594) from uncoupled 
oligonucleotides. 
(A) Absorption (at 260 nm, for DNA) curve as a function of time monitored during 
purification of probes coupled to Alexa 594 using HPLC. The first peak that appears 
between 20 and 30 min in this channel corresponds to oligonucleotide probes that do not 
have Alexa 594 coupled to them. Eluate is not collected for the duration of this peak. 
(B) Absorption (at 594 nm, for Alexa 594) curve as a function of time. Both absorption 
curves (A) and (B) are obtained simultaneously for the duration of the HPLC run. Only 
one distinct peak appears in this channel, representing absorption by probes with Alexa 
594 successfully coupled to them. This peak coincides with the second peak in the 
260 nm channel shown in (A). Eluate is collected for the entire duration of this peak in 
the 594 nm channel. 

4. Collect the contents of this peak (in the fluorophore absorption channel) 
manually into clean, RNase free tubes. It is important to collect all the 
solution that is coming out of the outlet, starting from the beginning of 
the left shoulder of this second peak and stopping the collection just at 
the tail end of the right shoulder of this second peak (Fig. 17.2B), 
because the different coupled oligonucleotides will have slightly 
different retention times; do not just "collect the peak." This collection 
typically lasts around 3–7 min in our experience. With the volumes 
we mentioned for our HPLC setup above, we typically collect between 
5 and 14 ml in this step with 0.5 ml/tube. The program we use then 
typically flows 70% Buffer B through the column for about 10 min. This 
step will "strip" the column of any impurities that may have stuck to the 
column and is especially important if you plan to purify additional 
probes. Be sure, however, to allow sufficient time for the column to 
reequilibrate to 7% B/93% A before injecting another sample. 

5. After collecting the solution of coupled probes, dry the collection in a 
SpeedVac rated for acetonitrile until the liquid is fully evaporated (about 
3–5 h). It is important to keep light out of the SpeedVac to avoid 
photobleaching of dyes, particularly highly photolabile cyanine dyes 
such as Cy3 and especially Cy5. 

6. Resuspend the contents in a total of 50–100 µl of TE (10 mM Tris with 
HCl to adjust pH, 1 mM EDTA, Ambion) at pH 8.0. This final 
suspension solution is now the "probe stock." 

7. From the "probe stock," create dilutions of 1:10, 1:20, 1:50, and 1:100 in 
TE to make "working stocks." This dilution series is used to determine 
which concentration of probes yields the best signals for RNA FISH. 

8. Store these probes in the dark at −20 °C until the sample is ready to be 
prepared. We have found that the probes can be stored for years in this way. 
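The buffer composition over the course of an HPLC run can be written down as a simple table of segments. The sketch below is illustrative only: it reads the gradient as 7% to 30% Buffer B (consistent with the 93% A/7% B equilibration), assumes the gradient is linear, and assumes a 10-min re-equilibration, which the protocol leaves open ("sufficient time"). The names `PROGRAM` and `percent_b_at` are hypothetical.

```python
# Each segment: (label, duration in min, %B at start, %B at end).
PROGRAM = [
    ("equilibrate", 10.0, 7.0, 7.0),
    ("gradient", 45.0, 7.0, 30.0),       # assumed to be linear
    ("strip", 10.0, 70.0, 70.0),
    ("re-equilibrate", 10.0, 7.0, 7.0),  # duration is an assumption
]


def percent_b_at(t_min, program=PROGRAM):
    """Return (segment label, % Buffer B) at t_min minutes into the program."""
    elapsed = 0.0
    for label, duration, b_start, b_end in program:
        if t_min < elapsed + duration:
            fraction = (t_min - elapsed) / duration
            return label, b_start + fraction * (b_end - b_start)
        elapsed += duration
    # Past the end of the program: hold the final composition.
    label, _, _, b_end = program[-1]
    return label, b_end


for t in (0.0, 10.0, 32.5, 55.0, 60.0, 70.0):
    label, b = percent_b_at(t)
    print(f"t = {t:5.1f} min  {label:<15s} {b:5.1f}% B / {100.0 - b:5.1f}% A")
```

Writing the program out this way makes it easy to check that a second injection would start from the same 7% B condition as the first.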



2.4. Fixing S. cerevisiae 

Having isolated the coupled probes, it is now time to fix the yeast cells so 
that these probes can be hybridized to their target mRNAs in these cells. The 
following procedure for fixing S. cerevisiae is adapted from 
Long et al. (1995). 
Procedure: 

1. Grow the yeast cells to an OD of around 0.1–0.2 (corresponding to 
about 1–2 × 10^6 cells/ml) in a 45-ml volume of minimal media with 
appropriate supplements (depending on the auxotrophy) in a batch shaker 
at 30 °C (we use 225 rpm). 

2. Add 5 ml of 37% formaldehyde (i.e., 100% formalin) directly to the 
growth media containing the cells and let it sit for 45 min at room 
temperature to fix the cells. Take safety precautions when handling 
formaldehyde, which is a carcinogen (i.e., use a chemical fume hood, 
gloves, and long-sleeved protective clothing). 

3. Concentrate the cells in this 50 ml into a single microcentrifuge tube. 
One way to do this is to run the 
50 ml mixture through a vacuum filter (with a filter paper having 0.2 µm 
pores: VWR vacuum filtration system "PES 0.2 µm") once, then shake 
the filter paper in RNase free water. Alternatively, one may simply 
centrifuge the contents at 2300 rpm for about 5 min and then resuspend 
in 1 ml Buffer B (the Buffer B of Long et al., 1995, not the HPLC buffer) 
to transfer to a microcentrifuge tube. 

4. Wash these concentrated cells in the microcentrifuge tube twice with 
1 ml ice-cold Buffer B (Long et al., 1995). 

5. Add 1 ml of spheroplasting buffer (from a stock made by adding 100 µl of 
200 mM vanadyl-ribonucleoside complex to 10 ml Buffer B), and 
transfer the mixture to a new RNase free microcentrifuge tube. 

6. Add 1 µl of zymolyase and incubate at 30 °C for about 15 min; this 
spheroplasting step removes the cell wall and is important for probe 
penetration. 

7. Wash the cells twice with 1 ml ice-cold Buffer B, centrifuging 
at 2000 rpm for 2 min between washes. 

8. Add 1 ml of 70% ethanol (diluted in RNase free water) to the cells and 
leave them for an hour or even overnight at 4 °C. 

The yeast cells have now been fixed and are ready for hybridization. 
These cells can be stored in ethanol for up to a week after fixation and 
perhaps even longer. 
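The final concentrations implied by the volumes in this procedure follow from a simple dilution calculation. The sketch below is a convenience check rather than part of the published protocol; the helper name `final_concentration` is hypothetical.

```python
def final_concentration(stock_conc, stock_volume, other_volume):
    """Concentration after adding stock_volume of stock to other_volume.

    Both volumes must be in the same units; the result has the units of
    stock_conc.
    """
    return stock_conc * stock_volume / (stock_volume + other_volume)


# Step 2: 5 ml of 37% formaldehyde added to 45 ml of culture.
formaldehyde_percent = final_concentration(37.0, 5.0, 45.0)

# Step 5: 100 ul (0.1 ml) of 200 mM vanadyl-ribonucleoside complex
# added to 10 ml of Buffer B.
vrc_mm = final_concentration(200.0, 0.1, 10.0)

print(f"formaldehyde during fixation: {formaldehyde_percent:.1f}%")
print(f"VRC in spheroplasting buffer: {vrc_mm:.2f} mM")
```

The same helper can be reused when the culture volume differs from 45 ml, to keep the formaldehyde concentration during fixation unchanged.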



2.5. Hybridizing probes to target mRNA 

The hybridization step contains three key parameters that may be varied to 
optimize the FISH signal. These are the temperature at which hybridization 
takes place, the concentration of formamide used in the hybridization and 
wash, and the concentration of the probe. The first two parameters essen- 
tially set the stringency of the hybridization; that is, the higher the tempera- 
ture or the concentration of formamide, the lower the likelihood of 
nonspecific binding of the probes. We usually elect to adjust the formamide 
concentration rather than the temperature and thus perform all hybridizations 
at 30 °C. Typically, we have found that hybridization and wash buffers containing 
10% formamide work quite nicely for most probes, yielding a fairly low 
background while also producing clear particulate signals. When 
the GC content of the probes is relatively high (>55%), however, we have found 
that we sometimes have to employ formamide concentrations of 20% or 
higher. Care must be taken in these instances, since the 
use of higher formamide concentrations can sometimes lead to a greatly 
diminished signal. Generally, we try to obtain signals at a standard 
concentration of formamide, because this greatly facilitates the simultaneous 
detection of multiple mRNAs: if the hybridization conditions are the same, 
multiplex detection is simply a matter of mix and match. 

The concentration of probe used is also very important in obtaining 
clear, low background signals. Typically, the optimal probe concentration 
must be found empirically, but we have found that concentrations can vary 
over roughly an order of magnitude and still produce satisfactory results. 
We typically start by using 1:1000, 1:2500, and 1:5000 dilutions of the 
original stock in hybridization buffer. One of these concentrations will 
usually yield good signals, but sometimes one must use drastically lower 
concentrations (100-fold lower) in order to obtain signals. 

2.5.1. Preparation of hybridization and wash buffers 

The following procedure describes preparation of 10 ml of hybridization 
buffer with the desired formamide concentration. Be sure to adjust the 
volumes appropriately if you are preparing a different total volume of 
hybridization buffer. 
Procedure: 

1. Dissolve 1 g of high molecular weight dextran sulfate (>50,000) in 
approximately 5 ml of nuclease-free water. Depending on the particular 
preparation of dextran sulfate used, the powder may dissolve quite rapidly 
with a bit of vortexing or may require rocking for several hours at room 
temperature. In the end, the solution should be clear and fairly viscous, 
although some preparations are far less viscous but still appear to work. 

2. Add 10 mg of E. coli tRNA (Sigma, 83854), vortexing to dissolve. 

3. Add 1 ml of 20x SSC (RNase free, Ambion), 40 µl (to get 0.02% in 
10 ml) of RNase free BSA (stock is 50 mg/ml = 5% solution from 
Ambion, AM261), 100 µl of 200 mM vanadyl-ribonucleoside complex 
(NEB S1402S), formamide to the desired concentration (10–30%), and 
then water to a final volume of 10 ml. When using formamide, one must 
first warm the bottle to room temperature before opening it to avoid 
oxidation; also, care must be taken when using formamide (i.e., use in 
the hood, wear protection, etc.) because it is a suspected carcinogen and 
teratogen and is readily absorbed through the skin. 

4. Once the solution is thoroughly mixed, filter the buffer into small aliquots; 
this removes any potential clumps that can yield a spotty background. 
We simply filter the solution in 500 µl aliquots using cartridge filters from 
Ambion. 

5. Store the solution at −20 °C for later use; the solution is typically good for 
several months to a year. 

6. Prepare the wash buffer by combining 5 ml of 20x SSC (Ambion), 5 ml 
of formamide (to a final concentration of 10% (v/v); this is adjusted if the 
hybridization buffer has a different formamide concentration), and 40 ml 
of RNase free water (Ambion) into one solution. 
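Because the recipe is given for 10 ml of hybridization buffer and 50 ml of wash buffer, scaling it to other volumes or formamide concentrations is a matter of proportional arithmetic. The sketch below is one way to do this; the function names and dictionary keys are hypothetical, and the amounts are simply those listed in the steps above.

```python
def hybridization_buffer(total_ml=10.0, formamide_percent=10.0):
    """Scale the 10 ml hybridization buffer recipe to total_ml."""
    scale = total_ml / 10.0
    return {
        "dextran sulfate (g)": 1.0 * scale,
        "E. coli tRNA (mg)": 10.0 * scale,
        "20x SSC (ml)": 1.0 * scale,
        "BSA, 50 mg/ml stock (ul)": 40.0 * scale,
        "VRC, 200 mM stock (ul)": 100.0 * scale,
        "formamide (ml)": total_ml * formamide_percent / 100.0,
        # Water is added last, up to the final volume, so it is not computed.
    }


def wash_buffer(total_ml=50.0, formamide_percent=10.0):
    """Volumes (ml) for wash buffer: 2x SSC plus the chosen formamide level."""
    ssc_ml = total_ml / 10.0  # 20x stock diluted to 2x
    formamide_ml = total_ml * formamide_percent / 100.0
    return {
        "20x SSC (ml)": ssc_ml,
        "formamide (ml)": formamide_ml,
        "RNase free water (ml)": total_ml - ssc_ml - formamide_ml,
    }


# Example: half a batch of hybridization buffer at 20% formamide,
# with a matching wash buffer.
for name, amount in hybridization_buffer(5.0, 20.0).items():
    print(f"{name:<28s} {amount:g}")
for name, amount in wash_buffer(50.0, 20.0).items():
    print(f"{name:<28s} {amount:g}")
```

With the default arguments, `wash_buffer()` reproduces the 5 ml / 5 ml / 40 ml volumes given in step 6.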
2.5.2. Hybridizing probes to yeast cells in solution 

Procedure: 

1. Warm the hybridization solution to room temperature before open- 
ing its cap to prevent oxidation of the formamide. 

2. Add 1–3 µl of the desired concentration of probes to 100 µl of the 
hybridization buffer. To determine what the desired concentration 
of probes is, we initially perform hybridizations with four dilutions of 
probes: 1:10, 1:20, 1:50, and 1:100 (mentioned in Section 2.3), and 
see which dilution gives the clearest signal. 

3. Centrifuge the fixed sample and aspirate away the ethanol, then 
resuspend the fixed cells in 1 ml of wash buffer containing the same 
formamide concentration as the hybridization buffer. 

4. Let the resuspension stand for about 2–5 min at room temperature. 

5. Centrifuge the sample and aspirate the wash buffer. Then add the 
hybridization solution. 

6. Incubate the sample overnight in the dark at 30 °C. 

7. Next morning, add 1 ml of wash buffer to this sample, vortex, 
centrifuge, then aspirate away the supernatant. 

8. Resuspend in 1 ml of wash buffer, then incubate at 30 °C for 30 min. 

9. Repeat the wash in another 1 ml of wash buffer for another 30 min at 
30 °C, this time adding 1 µl of 5 mg/ml DAPI for a nuclear stain. 

10A. If using photostable fluorophores such as TMR or Alexa 594: there is 
no need to add the GLOX solution. Just resuspend the sample in an 
appropriate volume (larger than 0.1 ml) of 2x SSC and proceed to 
imaging. 

10B. If using a highly photolabile fluorophore such as Cy5: resuspend the fixed 
cells in the GLOX buffer (used as an oxygen scavenger that removes 
oxygen from the medium to prevent light-initiated 
fluorophore-destroying reactions; see Section 2.5.3) without the enzymes and 
incubate for about 2 min for equilibration (see Section 2.5.3 for 
details). Then centrifuge, aspirate away the buffer, and resuspend the 
cells in 100 µl of GLOX buffer with the enzymes (glucose oxidase 
and catalase). These cells are now ready to be imaged. 

We found that our samples (either with or without the antibleach 
solution) can be kept at 4 °C for a day's worth of imaging. Keeping the 
samples at 4 °C prevents the probe-target hybrids from dissociating and thus 
degrading the signals. 
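The working stocks of Section 2.3 and the overall dilutions quoted at the start of Section 2.5 can be related with a short calculation. The sketch below assumes that the overall dilution is simply the working-stock dilution multiplied by the further dilution into 100 µl of hybridization buffer; the function name is hypothetical.

```python
def overall_fold_dilution(working_fold, probe_ul, buffer_ul=100.0):
    """Fold dilution of the probe stock in the final hybridization mix.

    working_fold: dilution of the working stock (10 for a 1:10 stock).
    probe_ul:     volume of working stock added to the hybridization buffer.
    buffer_ul:    volume of hybridization buffer (100 ul in step 2).
    """
    return working_fold * (probe_ul + buffer_ul) / probe_ul


for working_fold in (10, 20, 50, 100):
    for probe_ul in (1.0, 2.0, 3.0):
        fold = overall_fold_dilution(working_fold, probe_ul)
        print(f"1:{working_fold:<3d} working stock, {probe_ul:.0f} ul "
              f"-> about 1:{fold:.0f} overall")
```

For instance, 1 µl of the 1:10 working stock in 100 µl of buffer gives roughly a 1:1000 overall dilution, and 2 µl of the 1:50 stock gives roughly 1:2500.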

2.5.3. Preparation of antibleach solution and enzymes 
("GLOX solution") 

During imaging, we typically take several vertical stacks ("z-stacks") of 
images through a cell in a field of view, causing a hybridized fluorophore 
in a fixed cell to be excited by intense light several times. More importantly, 
when more than one type of fluorophore is used for imaging two or three 
species of mRNA, such z-stacks must be repeated to excite each of the 
different fluorophores, leading to even more exposure of the fluorophores. 
In our experience, only TMR and Alexa 594 could withstand such repeated 
excitations, whereas the Cy5 signal would rapidly degrade due to its especially 
high rate of photobleaching. To decrease the photolability of Cy5, we used 
an oxygen-scavenging system consisting of catalase, glucose oxidase, and 
glucose (GLOX solution) that is slightly modified from that used by Yildiz 
et al. (2003). This GLOX solution acts as an oxygen scavenger that removes 
oxygen from the medium. Since the light-initiated reactions that destroy 
fluorophores require oxygen, the GLOX buffer prevents these reactions 
from taking place. Indeed, we found that Cy5 was able to withstand 
nearly 10 times more exposure with the GLOX solution than without it. 
The following is a procedure for preparing the GLOX solution. 
Procedure: 

1. Mix together 0.85 ml of RNase free water with 100 µl of 20x SSC, 
40 µl of 10% glucose, and 5 µl of 2 M Tris-Cl (pH 8.0). This is the 
GLOX buffer (without glucose oxidase and catalase). 

2. Vortex the mixture, and then aliquot 100 µl of it into another tube. 

3. To this 100 µl aliquot of GLOX buffer (the oxygen-scavenging 
solution without enzymes), add 1 µl of glucose oxidase (from 3.7 mg/ml 
stock, dissolved in 50 mM sodium acetate, pH 5.2, Sigma) and 1 µl of 
catalase (Sigma). Before pipetting the catalase, vortex it a bit, since the 
catalase is kept in suspension (also, care should be taken when handling 
the catalase, since it has a tendency to get contaminated). This 100 µl will 
be referred to as "GLOX solution with enzymes." The GLOX solution 
without the enzymes will later be used as an equilibration buffer. 
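The final composition of the GLOX buffer follows directly from the volumes in step 1. The sketch below works it out, assuming that the 10% glucose stock is weight per volume; the catalase concentration is not computed because its stock concentration is not given in the protocol. All variable names are hypothetical.

```python
# Volumes (microliters) mixed in step 1 for the enzyme-free GLOX buffer.
COMPONENTS_UL = {
    "water": 850.0,
    "20x SSC": 100.0,
    "10% glucose": 40.0,
    "2 M Tris-Cl": 5.0,
}

total_ul = sum(COMPONENTS_UL.values())

final = {
    "SSC (x)": 20.0 * COMPONENTS_UL["20x SSC"] / total_ul,
    "glucose (%)": 10.0 * COMPONENTS_UL["10% glucose"] / total_ul,
    "Tris-Cl (mM)": 2000.0 * COMPONENTS_UL["2 M Tris-Cl"] / total_ul,
}

# Step 3: 1 ul of 3.7 mg/ml glucose oxidase and 1 ul of catalase are
# added to a 100 ul aliquot, giving 102 ul in total.
glucose_oxidase_ug_per_ml = 3.7 * 1000.0 * 1.0 / (100.0 + 1.0 + 1.0)

print(f"total volume: {total_ul:.0f} ul")
for name, value in final.items():
    print(f"{name:<14s} {value:.2f}")
print(f"glucose oxidase: {glucose_oxidase_ug_per_ml:.1f} ug/ml")
```

The buffer is therefore close to 2x SSC, 0.4% glucose, and 10 mM Tris-Cl.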

2.5.4. Imaging samples using a fluorescence microscope 

The fixed cells with probes properly hybridized are now ready for imaging. 
Our microscopy system is relatively standard: we use a Nikon TE2000 
inverted widefield epifluorescence microscope. It is important to use a fairly 
bright light source. For instance, a standard mercury lamp will suffice, 
although the newer metal-halide light sources (e.g., Lumen 200 from 
Prior) tend to produce a more intense and uniform illumination. Another 
important factor is the camera. It is important to use a cooled CCD camera 
that is optimized for low-light imaging rather than acquisition speed; we use 
a Pixis camera from Roper. Also, the camera should have a pixel size of 
13 µm or less. We should point out that the signals from the newer 
EMCCD cameras are no better than those from these more standard (and cheaper) 
cooled CCD cameras. We typically use a 100× DIC objective. If one is 
interested in imaging with Cy5, one must be sure that the objective has 
sufficient light transmission at those longer wavelengths; this can sometimes 
require an IR coating. When mounting the cells, it is important to make 
sure that one uses #1 coverglass (18 mm × 18 mm, 1 ounce) and that the 
yeast are directly on the coverglass: do not adhere the yeast to the slide 
and then cover with coverglass. One can enhance the adherence of the yeast 
to the coverglass by coating the coverglass with poly-L-lysine (put fresh 
1 mg/ml poly-L-lysine solution on the coverglass for 20 min, then suction 
off) or concanavalin A. It is also important to use #1 coverglass: we have 
found that even though most objectives are corrected for #1.5 coverglass, 
the mRNA spots are usually fuzzier and less distinct when imaged through 
#1.5 coverglass. 
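To see how the camera pixel size relates to the size of an mRNA spot, one can project the pixel into the sample plane. The sketch below does this for a 13 µm pixel and a 100× objective. The emission wavelength (580 nm) and numerical aperture (1.4) are assumptions chosen for illustration, not values given in this chapter, and the spot width uses the common approximation FWHM ≈ 0.51 λ/NA for a widefield microscope.

```python
def sample_pixel_nm(camera_pixel_um, magnification):
    """Size of one camera pixel projected into the sample plane, in nm."""
    return camera_pixel_um * 1000.0 / magnification


def airy_fwhm_nm(wavelength_nm, numerical_aperture):
    """Approximate lateral FWHM of a diffraction-limited spot, in nm."""
    return 0.51 * wavelength_nm / numerical_aperture


pixel_nm = sample_pixel_nm(13.0, 100.0)

# Assumed values for illustration only (not from the protocol).
fwhm_nm = airy_fwhm_nm(wavelength_nm=580.0, numerical_aperture=1.4)

print(f"pixel size in the sample plane: {pixel_nm:.0f} nm")
print(f"approximate spot FWHM:          {fwhm_nm:.0f} nm")
print(f"pixels across the FWHM:         {fwhm_nm / pixel_nm:.1f}")
```

Under these assumptions each spot spans only a few pixels, which is one way to see why larger camera pixels would blur neighboring spots together.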

There are two somewhat standard procedures often employed during 
fluorescence microscopy that we have found interfere with our single 
mRNA signals. One of these is the use of commercial antifade mounting 
solutions, which tend to introduce a large background while also decreasing 
the fluorescent signals from target mRNAs. We recommend instead 
using the custom-made GLOX solution or 2x SSC for imaging, being 
careful not to let the sample dry out. We also discourage the standard 
practice of using nail polish to seal the sample, as it introduces a background 
autofluorescence in the red channels that interferes with fluorescence 
from mRNA. 

2.6. Image processing: Detecting diffraction-limited 
mRNA spots 

We have devised an algorithm that automates some fraction of the work 
involved in analyzing images obtained from the samples (Raj et al., 2008). 
The first step in our algorithm is to apply a three-dimensional linear filter 
that is approximately a Gaussian convolved with a Laplacian to remove the 
nonuniform background while enhancing the signals from individual 
mRNA particles, thus increasing the signal-to-noise ratio (SNR) 
(Fig. 17.3B). The full width at half maximum of this Gaussian corresponds 
to the optimal bandwidth of our filter, and depends on the size of the 
observed particle. This width is a fit parameter that we empirically adjust to 
maximize the SNR. However, even after filtering, the images will 
contain some noise that requires thresholding to remove. In order to make a 
principled choice of threshold, we sweep over a range of possible values of 
the threshold, and plot the number of mRNAs detected at each value 
(Fig. 17.3C). Here, a single mRNA is defined as a collection of localized 
pixels (in the series of z-stacks) that form a connected component 
(Fig. 17.3D). We then typically find a plateau in this plot of the number 
of mRNAs counted as a function of the value of the threshold 
(i.e., increasing the threshold does not change the number of mRNAs 
counted), as seen in Fig. 17.3C. This implies that the signals from mRNAs 
are well separated from the background noise rather than smoothly 
"blending" into it. 

Figure 17.3 Example of mRNA spot detection algorithm applied to raw images of 
FKBP5 mRNA particles in A549 cells induced with dexamethasone. (A) Raw image 
data showing FKBP5 mRNA particles. (B) Upon applying a three-dimensional linear 
filter that is approximately a Gaussian convolved with a Laplacian to remove the 
nonuniform background while enhancing the signals from individual mRNA particles 
on the raw image shown in (A), the SNR is increased. (C) The number of spots counted 
as a function of the threshold value of the background after the application of the linear 
filter shows the existence of a plateau. This indicates a clear distinction between 
background fluorescence and actual mRNA spots. (D) Using the value of threshold 
shown as the gray line in (C), the raw image (A) has been transformed to an image in 
which each distinct computationally identified spot has been assigned a random color to 
facilitate visualization. Reprinted with permission from Raj et al. (2008). 

Indeed, the value of the threshold chosen in this plateau range yielded mRNA 
counts nearly equal to the mRNA counts we obtained through an independent 
method in which we count by eye without the aid of automation. The software used 
for this purpose is available for download on the Nature Methods supplementary 
information site for Raj et al. (2008). One can also make measurements 
based on mRNA spot intensity, although we feel that great care must be 
taken in these situations. One issue is that the intensity depends on how 
precisely focused the spot is, although this can be ameliorated by taking a 
large number of closely spaced fluorescent stacks. Another problem with 
computing total or mean intensity is that the boundary of the mRNA is 
hard to define, and the ultimate intensity measurement will depend heavily 
on this somewhat arbitrary choice. One way to skirt the issue is to use the 
maximum intensity within a given spot, since this is independent of the size 
of the spot. 
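The threshold sweep and connected-component counting described in this section can be sketched in a few lines. The code below is a minimal illustration and not the software of Raj et al. (2008): it assumes the image stack has already been filtered, uses 26-connectivity (an assumption), picks the plateau as the longest run of identical nonzero counts (a simplification), and runs on a small synthetic stack. All function names are hypothetical.

```python
from collections import deque
from itertools import product

# 26-connectivity: neighbors sharing a face, edge, or corner (an assumption).
OFFSETS = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]


def find_spots(stack, threshold):
    """Return the peak intensity of each connected component above threshold.

    stack is a nested list indexed as stack[z][y][x].
    """
    nz, ny, nx = len(stack), len(stack[0]), len(stack[0][0])
    seen = set()
    peaks = []
    for start in product(range(nz), range(ny), range(nx)):
        z, y, x = start
        if start in seen or stack[z][y][x] <= threshold:
            continue
        # Breadth-first flood fill over one connected component.
        seen.add(start)
        queue = deque([start])
        peak = stack[z][y][x]
        while queue:
            cz, cy, cx = queue.popleft()
            peak = max(peak, stack[cz][cy][cx])
            for dz, dy, dx in OFFSETS:
                pz, py, px = cz + dz, cy + dy, cx + dx
                inside = 0 <= pz < nz and 0 <= py < ny and 0 <= px < nx
                if (inside and (pz, py, px) not in seen
                        and stack[pz][py][px] > threshold):
                    seen.add((pz, py, px))
                    queue.append((pz, py, px))
        peaks.append(peak)
    return peaks


def longest_plateau(thresholds, counts):
    """Return (count, first threshold, last threshold) for the longest run of
    identical nonzero counts, or None if every count is zero."""
    best = None
    start = 0
    for i in range(1, len(counts) + 1):
        if i == len(counts) or counts[i] != counts[start]:
            length = i - start
            if counts[start] > 0 and (best is None or length > best[0]):
                best = (length, start, i - 1)
            start = i
    if best is None:
        return None
    _, first, last = best
    return counts[first], thresholds[first], thresholds[last]


def make_test_stack():
    """Synthetic 3 x 8 x 8 stack: low patterned background plus three spots."""
    stack = [[[(7 * x + 13 * y + 29 * z) % 10 for x in range(8)]
              for y in range(8)] for z in range(3)]
    faces = [(-1, 0, 0), (1, 0, 0), (0, -1, 0), (0, 1, 0), (0, 0, -1), (0, 0, 1)]
    for cz, cy, cx in [(1, 2, 2), (1, 6, 3), (1, 2, 6)]:
        for dz, dy, dx in faces:
            stack[cz + dz][cy + dy][cx + dx] = 60  # dimmer shoulder voxels
        stack[cz][cy][cx] = 100                    # bright spot center
    return stack


stack = make_test_stack()
thresholds = list(range(0, 125, 5))
counts = [len(find_spots(stack, t)) for t in thresholds]
plateau = longest_plateau(thresholds, counts)
print("spots counted at each threshold:", counts)
print("plateau (count, first threshold, last threshold):", plateau)

# Maximum intensity per spot at a threshold in the middle of the plateau.
mid_threshold = (plateau[1] + plateau[2]) // 2
print("peak intensities:", find_spots(stack, mid_threshold))
```

Reporting the maximum intensity of each component, as in the last line, corresponds to the size-independent intensity measure suggested above.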
3. Example: STL1 mRNA Detection in Response to 
NaCl Shock 



As an application of the FISH technique we just outlined, we now 
show an example of this technique applied to S. cerevisiae. One mRNA of 
interest in yeast is that of the STL1 gene, whose expression level dramatically 
increases when the cell is subjected to an osmotic shock (Rep et al., 2000). 
One way to induce such a shock is by increasing the concentration of NaCl in 
the cell's growth medium. For this purpose, a strain based on the common 
laboratory strain BY4741 (MATa, his3Δ1 leu2Δ0 met15Δ0 ura3Δ0, 
YER118c::kanMX) was grown to an OD of 0.56 (~0.7 × 10^7 cells/ml) in a 50-ml 
volume of complete supplemental media without histidine and uracil. We then 
shocked the cells osmotically by transferring them to a medium with 0.4 M 
NaCl and leaving them there for 10 min. We fixed these shocked cells along 
with their unshocked counterparts using the method we outlined before (5 ml 
of 37% formaldehyde was added directly to the medium for 45 min). The 
fixation and spheroplasting procedures were adopted from Long et al. 
(1995), with the exception that after spheroplasting, the cells were 
incubated in concanavalin A (0.1 mg/ml, Sigma) for about 2 h before letting them 
settle onto a coverglass with a chamber that had been coated with concanavalin A 
overnight. We used concanavalin A because it helped the yeast cells stick to the 
coverglass, although as mentioned earlier, it is also possible to simply use 
poly-L-lysine coated coverglass without incubating the cells in concanavalin A. The 
resulting images of RNA FISH performed on unshocked and shocked cells can 
be seen in Fig. 17.4A and B, respectively. As seen in these figures, the RNA 
FISH technique of Raj et al. (2008) allows one not only to 
resolve individual STL1 mRNAs but also to extract spatial information on 
their whereabouts (helped by DAPI staining of the cell's nucleus). In addition, 
taking snapshots of STL1 mRNAs at two different time points as shown in 
Fig. 17.4A and B illustrates how one can reconstruct the dynamics of the mRNA 
distribution in a population of cells by performing FISH on the cells at different 
time points. 

Figure 17.4 Single mRNA molecules imaged in S. cerevisiae using the fluorescence in 
situ hybridization method of Raj et al. (2008). Scale bars (white lines) indicate 5 µm. 
(A) STL1 mRNA particles in yeast cells before being subjected to osmotic shock (0 M 
NaCl in the growth medium). (B) STL1 mRNA particles in yeast cells 10 min after they 
have been growing in the presence of a high level of salt (0.4 M NaCl), thus inducing 
osmotic shock. DAPI was used to stain the nucleus of the cells shown in purple. The 
STL1 gene expression increases dramatically after the osmotic shock. Reprinted with 
permission from Raj et al. (2008). 




4. Conclusions 

Although we have limited our description of RNA FISH to just 
S. cerevisiae, this method has so far been applied to E. coli, Caenorhabditis 
elegans, Drosophila melanogaster, and rat hippocampus neuronal cell cultures 
(Raj et al., 2008). In fact, the protocol we described requires just a few 
adjustments in order to be applicable to these organisms. The method is 
likely to be applicable to other organisms as well. Studying how individual 
yeast cells behave through single cell measurements and using the distribu- 
tions constructed through those measurements to look at how populations 
of cells behave remains a vital field of research today. We believe that the 
FISH method for visualizing a single mRNA molecule in yeast will play an 
important role in such endeavors. 



ACKNOWLEDGMENTS 

We thank G. Neuert for sharing with us his unpublished STL1 RNA FISH data. A. v. O. 
was supported by NSF grant PHY-0548484 and NIH grant R01-GM077183. A. R. was 
supported by an NSF fellowship DMS-0603392 and a Burroughs Wellcome Fund Career 
Award at the Scientific Interface. H. Y. was partly supported by the Natural Sciences and 
Engineering Research Council of Canada's Graduate Fellowship PGS-D2. 



REFERENCES 

Beach, D., Salmon, E., and Bloom, K. (1999). Localization and anchoring of mRNA in budding yeast. Curr. Biol. 9, 569–578. 
Bertrand, E., Chartrand, P., Schaefer, M., Shenoy, S., Singer, R., and Long, R. (1998). Localization of ASH1 mRNA particles in living yeast. Mol. Cell 2, 437–445. 
Chubb, J., Trcek, T., Shenoy, S., and Singer, R. (2006). Transcriptional pulsing of a developmental gene. Curr. Biol. 16, 1018–1025. 
Femino, A., Fay, F., Fogarty, K., and Singer, R. (1998). Visualization of single RNA transcripts in situ. Science 280, 585–590. 
Gall, J. (1968). Differential synthesis of the genes for ribosomal RNA during amphibian oogenesis. Proc. Natl. Acad. Sci. USA 60, 553–560. 



446 Hyun Youk et al. 



Golding, I., Paulsson, J., Zawilski, S., and Cox, E. (2005). Real-time kinetics of gene activity in individual bacteria. Cell 123, 1025–1036. 
Levsky, J., and Singer, R. (2003). Fluorescence in situ hybridization: Past, present and future. J. Cell Sci. 116, 2833–2838. 
Long, R., Elliott, D., Stutz, F., Rosbash, M., and Singer, R. (1995). Spatial consequences of defective processing of specific yeast mRNAs revealed by fluorescent in situ hybridization. RNA 1, 1071–1078. 
Long, R., Singer, R., Meng, X., Gonzalez, I., Nasmyth, K., and Jansen, R. (1997). Mating type switching in yeast controlled by asymmetric localization of ASH1 mRNA. Science 277, 383–387. 
Maamar, H., Raj, A., and Dubnau, D. (2007). Noise in gene expression determines cell fate in Bacillus subtilis. Science 317, 526–529. 
Raj, A., and van Oudenaarden, A. (2008). Nature, nurture, or chance: Stochastic gene expression and its consequences. Cell 135, 216–226. 
Raj, A., Peskin, C., Tranchina, D., Vargas, D., and Tyagi, S. (2006). Stochastic mRNA synthesis in mammalian cells. PLoS Biol. 4, e309. 
Raj, A., van den Bogaard, P., Rifkin, S., van Oudenaarden, A., and Tyagi, S. (2008). Imaging individual mRNA molecules using multiple singly labeled probes. Nat. Methods 5, 877–879. 
Rep, M., Krantz, M., Thevelein, J., and Hohmann, S. (2000). The transcriptional response of Saccharomyces cerevisiae to osmotic shock. Hot1p and Msn2p/Msn4p are required for the induction of subsets of high osmolarity glycerol pathway-dependent genes. J. Biol. Chem. 275, 8290–8300. 
Rosenfeld, N., Young, J., Alon, U., Swain, P., and Elowitz, M. (2005). Gene regulation at the single-cell level. Science 307, 1962–1965. 
Sindelar, L., and Jaklevic, J. (1995). High-throughput DNA synthesis in a multichannel format. Nucleic Acids Res. 23, 982–987. 
Yildiz, A., Forkey, J., McKinney, S., Ha, T., Goldman, Y., and Selvin, P. (2003). Myosin V walks hand-over-hand: Single fluorophore imaging with 1.5 nm localization. Science 300, 2061–2065. 
Zenklusen, D., Wells, A., Condeelis, J., and Singer, R. (2007). Imaging real-time gene expression in living yeast. Cold Spring Harb. Protoc. DOI: 10.1101/pdb.prot4870. 




CHAPTER EIGHTEEN 



Reconstructing Gene Histories 
in Ascomycota Fungi 

Ilan Wapinski and Aviv Regev 



Contents 

1. Introduction 
2. Synergy 
    2.1. Overview 
    2.2. Defining orthogroups 
    2.3. Scoring gene similarity 
    2.4. Gene similarity graph 
    2.5. Identifying orthogroups 
3. Evaluating Orthogroup Quality 
    3.1. Fungal orthogroup robustness 
    3.2. Comparison to curated resource 
    3.3. Simulated orthogroups 
4. Biological Analysis of Gene Histories 
    4.1. Defining orthogroup categories 
    4.2. Singletons and ORF predictions 
    4.3. Gene sets and orthogroup projections 
    4.4. Copy-number variation profiles 
5. Analysis of Paralogous Genes 
    5.1. Estimating functional divergence between paralogous genes 
    5.2. Estimating divergence based on degree of conserved interactions 
6. Discussion and General Applicability 
References 

Abstract 
whole-genome duplication event. Such studies first require an accurate and 
comprehensive mapping of all orthologous loci across all species. In this 
chapter, we present a computational framework for systematic reconstruction 
of all gene orthology relations across multiple yeast species. We then discuss 
how to use the resulting genome- and species-wide catalogue of gene phylogenies 
to study the histories of gene duplications and losses from a functional 
genomics perspective. We show how these methods allowed us to uncover the 
functional constraints underlying gene duplications and losses within Ascomycota 
fungi, and to highlight the importance and limitations of these evolutionary 
processes. The analytical framework we present here is generalizable and 
scalable, and can be applied to an array of comparative genomics needs. 




1. Introduction 

Comparative genomics is a powerful tool for evolutionary and 
functional studies of biological systems (Cliften et al., 2003; Kellis et al., 
2003). Such studies require a reconstruction of the evolutionary history of 
individual genes, and their relation to one another through speciation 
(orthologs) or duplication (paralogs) events (Fitch, 1970). Applying these 
concepts at a genome-wide scale (phylogenomics; Eisen, 1998; Eisen and 
Fraser, 2003; Eisen and Wu, 2002) allows us to functionally characterize 
and classify genes (Engelhardt et al., 2005; Tatusov et al., 1997), and to 
understand the evolutionary impact of genomic events (Blomme et al., 
2006; Dietrich et al., 2004; Kellis et al., 2004a; Scannell et al., 2006). 

There are two broad categories of methods for the identification of 
orthologous and paralogous genes. The first class of methods infers homology 
relations based on hit clustering, using the results ("hits") from sequence 
similarity searches such as BLAST or FASTA (Altschul et al., 1990; Pearson 
and Lipman, 1988) between all the proteins in different species to output an 
orthology assignment between the genes. The most widely used variant of 
this approach is "reciprocal (bidirectional) best hits" (RBH), where two genes 
in two different species are identified as orthologs if each is the other's best 
"hit" in that species (Fitch, 1970; Wall et al., 2003). Related approaches 
include more inclusive clustering methods (e.g., COGs; Tatusov et al., 1997), 
and algorithms designed to distinguish between recent and ancient gene 
duplications (e.g., InParanoid, Remm et al., 2001; OrthoMCL, Li et al., 
2003). One recent extension of a hit-clustering algorithm incorporated 
information on orthologous chromosomal regions (synteny) to guide 
orthology assignments (e.g., BUS; Kellis et al., 2004b). Synteny-based methods 
are particularly helpful in handling orthology assignments that are ambiguous 
based on hit clustering alone, but they cannot be applied between distantly 
related species, where gene order is not sufficiently conserved. Hit-clustering 
methods are easy to implement and fast, but they do not explicitly reconstruct 
the evolutionary history of orthologous genes, as they either ignore paralogs 
altogether by assuming that orthology is a one-to-one relationship (e.g., 
RBH) or do not resolve exact orthology and paralogy relations when 
identifying genes with shared ancestry (e.g., COGs). 

A complementary set of approaches identifies homology relations in 
light of the phylogenetic gene tree of a related group of genes. These allow us 
to infer lineage-specific duplications and losses by comparison to the 
corresponding species tree (reconciliation; Goodman et al., 1979; Zmasek 
and Eddy, 2001; Fig. 18.1). The main limitation of these methods is that 
single-locus phylogenies are rarely accurate, especially when considering 
a large number of taxa, where there are many possible tree topologies 
(as discussed in Felsenstein, 2004). The resulting gene trees may, therefore, 
require a high number of duplication and loss events to be reconciled, even 
when there is only a single copy of a particular locus in each taxon, an 
unlikely scenario assuming such events are relatively rare. Recent methods 
attempt to balance the number of inferred duplications and losses with 
evidence derived from sequence alignments (Arvestad et al., 2003; Durand 
et al., 2006). While such approaches often result in high-quality 
reconstructions of gene histories, they are computationally intensive and have, 
therefore, been typically restricted to predefined families of genes rather 
than applied on a genomic scale. 

Figure 18.1 Homology subtypes (orthology and paralogy) within a group of 
orthologous genes. (A) Species tree. Each node (square) in the tree is a species, 
either extant (leaf node) or ancestral (internal node). In this toy example, 
speciation events 1 and 2 have resulted in extant species a, b, and c. (B) A gene 
tree describing the evolutionary events for a family of four genes (informally 
denoted on the species tree for illustrative purposes). Each node in the tree is a 
gene (circle) or a duplication event (star). The tree shows the evolutionary 
descent of an ancestral gene to paralogs and orthologs following gene 
duplication before species y, and the subsequent speciation yielding species a 
and b. One gene copy was lost (strike and dashed lines) after the duplication 
event, but its paralog was retained. (C) Synteny between chromosomal regions in 
species a and b. Each chromosome has several similar (syntenic) blocks (hatched 
boxes) composed of multiple genes. Some regions in one genome (gray box) do not 
have a syntenic counterpart in the other. The synteny similarity score for a pair 
of genes is the fraction of their neighbors that are orthologous to each other. 
For example, the score for one pair of genes shown in the panel is 2/3. 

Recent efforts to apply phylogenetic methods toward large-scale 
resolution of orthologies (Goodstadt and Ponting, 2006; Hahn et al., 2007; Jothi 
et al., 2006) handle the task in a sequential way: first, they use hit-clustering 
methods to identify coarse gene families and then construct gene trees to 
refine these assignments. With few exceptions (Hahn et al., 2007), the latter 
phylogenetic step does not employ the more sophisticated but 
computationally intensive phylogenetic algorithms. Thus, it does not account for 
gene tree distortions that induce large numbers of unlikely duplication and 
loss events (Blomme et al., 2006; Dufayard et al., 2005; Jothi et al., 2006). 
Such distortions are common as genes within families often evolve at 
variable rates, especially following gene duplication events (Kellis et al., 
2004a; Lynch and Katju, 2004). Consequently, laborious manual curation 
by experts may be required to achieve reasonable results (Li et al., 2006), or 
more complicated families must be ignored a priori (Blomme et al., 2006). 
Further, these approaches have not incorporated synteny, an important 
source of evidence for determining gene ancestry. 

Here, we describe a framework for the genome-wide reconstruction of 
homology relations across multiple eukaryotic genomes and present a fully 
automatic and scalable implementation of this framework in the Synergy 
algorithm (Wapinski et al., 2007a,b). Given a set of genomes and the known 
species phylogeny, Synergy resolves the orthology and paralogy relations for 
all the protein-coding genes in those genomes, while simultaneously 
reconstructing the phylogenetic gene trees for each group of orthologs. Our 
approach combines the scalability and automation of hit-clustering 
approaches with the detailed phylogenetic reconstruction of tree-based 
methods, resulting in a robust resolution of homology relations. By 
simultaneously reconstructing gene trees while identifying orthologous groups, 
our approach avoids many of the pitfalls of sequential methods. This 
method is flexible and can incorporate additional types of data whenever 
available (e.g., synteny). To automatically assess the quality of our 
assignments, we also develop a jackknife-based method for measuring their 
robustness to perturbations in the included genes and species. 

We demonstrate the quality and utility of such reconstructions in an 
analysis of gene histories in Ascomycota fungi. In particular, we show how to 
evaluate the quality of gene orthologies and how to apply an array of 
functional genomics data and techniques to study the functional constraints 
on gene duplications and losses and the impact of gene duplications on 
cellular networks. The Synergy algorithm and these analytical approaches 
are broadly applicable to the study of gene and genome evolution in other 
microbial and eukaryotic genomes. 




2. Synergy 

2.1. Overview 

Given a set of species, their protein-coding genes, and their phylogenetic 
tree, Synergy partitions the genes into disjoint subsets, where each subset 
contains all and only those genes that descended from a single gene in the 
species' last common ancestor. Synergy simultaneously reconstructs the 
phylogenetic gene tree for each such subset of genes. Briefly, Synergy 
performs this task in a step-wise bottom-up fashion, solving it sequentially 
for each ancestral node in a species tree from the leaves of the tree to the 
root. At each stage (i.e., node in the species tree), Synergy first clusters 
together the genes or groups of orthologs from previous stages that share 
significant sequence similarity. It then reconstructs a phylogenetic gene tree 
for each of these intermediate groups of orthologs, and uses this tree to 
partition the clusters such that each contains only genes that are descended 
from a single hypothetical gene in the ancestral species corresponding to the 
current stage. Thus, after each stage, Synergy has made a complete orthol- 
ogy assignment and gene tree reconstruction for the complement of genes 
below the corresponding node in the species tree. These are then passed up 
to the next stage. Once Synergy reaches the root of the species tree, it has 
made a full partitioning of groups of orthologs that are descended from a 
single ancestral gene along with a corresponding gene tree for each such 
group. Below we discuss each step in this procedure. 



2.2. Defining orthogroups 

There are two major classes of homology relations between genes (Fitch, 1970). 
Orthologs are genes that share a common ancestor at a speciation event, 
while paralogs are related through duplication events (Fig. 18.1A and B). 
These are not necessarily simple one-to-one relationships. For example, two 
paralogous genes that resulted from a duplication after a speciation event are 
both orthologous to the same gene in another species (Fig. 18.1). Conversely, 
when genes are lost in a particular species or lineage, orthology may be a one- or 
many-to-none relationship. 

Figure 18.2 Orthogroups and their phylogenetic gene trees. (A) A species tree 
rooted at the ancestral species x. Only a fraction of the tree is shown. (B, C) A 
gene tree (e.g., P^x) represents the evolutionary history of all the genes that 
descended from the gene g^x in species x. Each internal node in the gene tree 
defines a corresponding orthogroup, whose members are the genes below that node 
in the tree. The gene tree can track duplication events (star, panel C, OG2 and 
OG3). 

Such relations are captured by phylogenetic trees (Fig. 18.2A-C; 
Section 1). We denote a species tree T where internal nodes (x, y, . . .) 
represent ancestral species, and leaf nodes (a, b, . . .) represent extant species 
(Fig. 18.2A). We denote as g^a a gene g in species a. The exact orthology and 
paralogy relations between genes are represented in a gene tree P 
(Fig. 18.1B). The leaves in P are the genes that descended from a single 
common ancestor gene at the root of P. Its internal nodes represent the 
speciation and duplication events that occurred in the course of the 
genes' evolution (Fig. 18.1B). 

We define an orthogroup as the set of genes that descended from a single 
common ancestral gene. An orthogroup OG^x is defined with respect to an 
ancestral species x in T and includes all and only those genes from the 
extant species under x that descended from a single common ancestral gene, 
g^x, in x. We therefore define: 

Definition 1. (Orthogroup soundness) 

An orthogroup OG^x under the ancestral species x ∈ T is sound if there 
existed a gene g^x in x such that every gene g^a ∈ OG^x is a descendant 
of g^x. 

Definition 2. (Orthogroup completeness) 

OG^x is complete if every gene g^a that descended from the ancestral gene 
g^x is in OG^x. 
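
The two definitions can be checked mechanically when the true ancestry is known, for example on simulated data. The sketch below is only an illustration: the function names and the `ancestor_of` mapping (extant gene to its ancestral gene in x) are assumptions introduced here, not part of Synergy.

```python
def is_sound(orthogroup, ancestor_of):
    """Definition 1: all members descend from a single ancestral gene in x."""
    return len({ancestor_of[g] for g in orthogroup}) == 1


def is_complete(orthogroup, ancestor_of):
    """Definition 2: no descendant of the members' ancestral gene is left out."""
    ancestors = {ancestor_of[g] for g in orthogroup}
    descendants = {g for g, a in ancestor_of.items() if a in ancestors}
    return descendants <= set(orthogroup)


# Hypothetical ground truth: extant gene -> ancestral gene in species x.
ancestor_of = {"a1": "x1", "b1": "x1", "b2": "x1", "a2": "x2"}

sound_but_incomplete = {"a1", "b1"}          # misses the paralog b2
complete_and_sound = {"a1", "b1", "b2"}
unsound = {"a1", "a2"}                       # mixes two ancestral genes
```

In real data the ancestry is of course unknown, which is why the chapter relies on gene trees and rooting to enforce these properties.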



Footnote 1. We assume that gene fusion and horizontal transfer events are rare 
and that genes therefore descend from single genes, allowing us to represent 
gene phylogenies as trees. 
Each orthogroup OG^x has a corresponding gene tree P^x. The leaves in 
P^x are the genes g^a ∈ OG^x (for every extant species a at the leaves of T 
under x), and its internal nodes denote ancestral genes and the duplication 
events that occurred along OG^x's evolution (Fig. 18.1B). The root of P^x 
represents the ancestral gene g^x, the last common ancestor of all g^a ∈ OG^x. 

Since Synergy aggregates orthogroups as it recursively traverses the nodes 
of the species tree (Section 2.5), it will at times be at nodes whose child nodes 
are extant species (e.g., node x in Fig. 18.2A) and at times be at nodes whose 
children are also ancestral nodes (e.g., node y in Fig. 18.2A). In the former 
situation, the orthogroups in consideration are equivalent to individual genes 
in an extant species. In the latter, they correspond to hypothetical ancestral 
genes in the corresponding ancestral species. Since an orthogroup OG^x 
represents the gene g^x whether it be at an ancestral or extant node, we will 
subsequently refer to orthogroups and genes interchangeably. 



2.3. Scoring gene similarity 

The common ancestry of homologous proteins suggests that they should retain 
some sequence similarity. The estimate of the evolutionary distance between 
pairs of proteins is the basis for our reconstruction method. Although our 
method can be applied with any method for computing these pairwise 
distances, much of its success depends on these choices. Here, we use a 
measure of distance that examines the evolution of both the amino acid 
sequence of the proteins and the chromosomal organization of genomes. 

When comparing amino acid sequences, we use standard models of 
amino acid evolution. Specifically, our peptide sequence similarity score (d^p) 
between a pair of proteins is based on the JTT amino acid substitution rates 
(Jones et al., 1992). To compute d^p we first globally align two proteins, then 
search for the distance that maximizes the likelihood of substitutions in each 
alignment position. 

To capture the information that genome organization conveys about the 
homology between proteins, our synteny similarity score (d^s) quantifies the 
similarity between the chromosomal neighborhoods of two genes 
(Fig. 18.1C). A (preliminary) orthology assignment anchors chromosomal 
regions in two species to one another. Genes that are highly syntenic to each 
other will share many such anchors between their chromosomal 
neighborhoods. Since there is currently no agreed-upon evolutionary model of 
genome organization, we compute the synteny similarity score between 
two genes as the fraction of their neighbors that are orthologous to one 
another (Fig. 18.1C). The source of the preliminary orthology assignment 
will be discussed below. 
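
As a rough illustration of the neighbor-fraction idea, the sketch below counts how many flanking genes of one gene can be paired with a preliminarily orthologous flanking gene of the other. The function name, the greedy pairing, and the choice of denominator (the larger neighborhood) are assumptions made for this example; the chapter does not specify the window size or how ties are handled.

```python
def synteny_similarity(neighbors_a, neighbors_b, orthologs):
    """Fraction of chromosomal neighbors that are (preliminarily) orthologous.

    neighbors_a, neighbors_b: gene IDs flanking the two genes being compared.
    orthologs: set of frozenset({gene_in_a, gene_in_b}) anchor pairs.
    """
    if not neighbors_a or not neighbors_b:
        return 0.0
    unmatched_b = set(neighbors_b)
    matched = 0
    for ga in neighbors_a:
        # Greedily pair each neighbor in a with one unused orthologous neighbor in b.
        partner = next((gb for gb in sorted(unmatched_b)
                        if frozenset((ga, gb)) in orthologs), None)
        if partner is not None:
            unmatched_b.discard(partner)
            matched += 1
    return matched / max(len(neighbors_a), len(neighbors_b))


# Toy neighborhoods: two of three flanking genes are anchored to each other.
anchors = {frozenset(("a1", "b1")), frozenset(("a2", "b2"))}
score = synteny_similarity(["a1", "a2", "a3"], ["b1", "b2", "b4"], anchors)
```

With these toy inputs two of the three neighbors are anchored, so the score is 2/3, the same value used as the example in Fig. 18.1C.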

Both d^p and d^s are scaled and treated as distances for assessing protein and 
chromosomal evolution between pairs of genes. The protein similarity score 
scales with evolutionary distance. We scale the synteny score to the same 
range but we do not make any assumptions about its direct evolutionary 
interpretation. Two genes with high similarity have scores close to zero, 
while genes sharing no similarity have scores of 2.0. We combine these two 
measures to identify potentially orthologous genes. 

2.4. Gene similarity graph 

Synergy relies on the precomputed distances between genes to make 
orthology assignments, represented by a gene similarity graph. This is a 
weighted directed graph G = (V, E), where V are all the individual genes 
in the input genomes, and the edges E represent potential homology 
relations. To generate E, we first execute all-versus-all FASTA alignments 
between all genes in our input (Pearson and Lipman, 1988). 

Since we expect most of the gene pairs to share no common ancestor or 
sequence similarity, we wish to maintain edges only between genes with 
relevant distances, thus helping to guide the algorithm by identifying the 
potential homologies. Once we designate gene pairs that are significantly 
similar, we place an edge between g^a in species a and g^b in species b if the 
FASTA E-value of their alignment is below 0.1 and either g^b is the best 
FASTA hit in species b to g^a or the percent identity between g^a and g^b is 
above 50% of that between g^a and its best hit in b. These parameter choices 
are relevant to the implementation that we employed but not to the 
framework for resolving orthologs that we propose. Our results from 
varying these choices show that these parameters were relaxed enough to 
capture a high proportion of putative homology relations, while at the same 
time being restrictive enough to ensure that Synergy runs efficiently. These 
parameters are given as inputs to our implementation. 

Once we place edges between the nodes in the similarity graph, we 
weight each of them by the d^p score defined above. While this distance is 
symmetric, the edges are directed, and are placed from the query to the 
target gene based on the direction of the similarity search. Thus, not all 
edges are reciprocal; the significant hits from a given gene g^a may not all 
include g^a among their significant hits. 
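
The edge rule described in this section can be sketched as a filter over a table of similarity-search hits. This is a minimal illustration, not the authors' code: the tuple layout of `hits`, the function name, and the reading of "best hit" as the hit with the lowest E-value are assumptions; the thresholds 0.1 and 50% come from the text and are exposed as parameters.

```python
from collections import defaultdict


def build_edges(hits, max_evalue=0.1, identity_fraction=0.5):
    """Select directed edges (query, target) from similarity-search hits.

    hits: iterable of (query, target, target_species, evalue, percent_identity).
    """
    by_query_species = defaultdict(list)
    for q, t, sp, ev, pid in hits:
        if ev < max_evalue:                      # significance filter
            by_query_species[(q, sp)].append((t, ev, pid))
    edges = set()
    for (q, sp), cands in by_query_species.items():
        # Assumption: the best hit in a species is the one with the lowest E-value.
        best_t, _, best_pid = min(cands, key=lambda c: c[1])
        for t, ev, pid in cands:
            if t == best_t or pid > identity_fraction * best_pid:
                edges.add((q, t))
    return edges


# Hypothetical hits for gene a1 against species b, plus one reverse hit.
hits = [
    ("a1", "b1", "b", 1e-50, 90.0),   # best hit in b
    ("a1", "b2", "b", 1e-10, 50.0),   # kept: 50 > 0.5 * 90
    ("a1", "b3", "b", 1e-3, 30.0),    # dropped: identity too low
    ("a1", "b4", "b", 0.5, 80.0),     # dropped: E-value not below 0.1
    ("b1", "a1", "a", 1e-50, 90.0),
]
edges = build_edges(hits)
```

In this toy input the edge from a1 to b2 has no reverse counterpart, which illustrates why the graph is directed and why not all edges are reciprocal.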

Footnote: We rely more heavily on the protein similarity described in 
Section 2.3 than on bit scores or E-values because the best "hit" is often not 
the nearest phylogenetic neighbor (Koski and Golding, 2001; Wall et al., 2003). 

2.5. Identifying orthogroups 

Identifying orthogroups across multiple species amounts to sequentially 
reconstructing the shared ancestral relationships between genes at each 
internal position of a phylogenetic tree. To this end, Synergy (Fig. 18.3) 
recursively traverses the nodes of the given species tree T from its leaves to 
its root, identifying orthogroups with respect to each node. At each 
recursive stage, Synergy assumes that sound and complete orthogroups and their 
corresponding gene trees are resolved for the lower nodes in the tree. For 
each internal node x ∈ T, Synergy uses the distances between genes (or, 
equivalently, between orthogroups resolved in previous stages) to 
determine the orthogroups {OG^x} and reconstruct the phylogenetic gene trees 
{P^x} between the member genes of each orthogroup. Once this is 
completed, the set of newly identified orthogroups and their corresponding gene 
trees are recorded. At this point the procedure updates the gene similarity 
graph by replacing the genes in the species below x by orthogroups in 
{OG^x}, and the next stage of the algorithm treats these orthogroups as 
genes. When the bottom-up recursion reaches the root of T, every gene g^a 
in each species has been assigned uniquely to an orthogroup and located as 
a leaf in the corresponding gene tree. 

Synergy Algorithm 
Input: A species tree node x 
Output: A set of orthogroups {OG^x} 

if x is an extant species 
    {OG^x} <- {g^x} 
else 
    // call Synergy recursively 
    {OG^r} <- Synergy(x.right) 
    {OG^l} <- Synergy(x.left) 
    // step 1: match orthogroups; make putative orthogroups {OG^x} 
    {OG^x} <- MatchOrthogroups(x, {OG^r}, {OG^l}) 
    // step 2: make the phylogenetic gene tree {P^x} for the orthogroups OG^x 
    repeat 
        Choose an unprocessed OG^x in {OG^x} 
        // construct the unrooted phylogenetic tree topology 
        P^x <- MakeTree(OG^x) 
        // now use Eq. 18.1 to select the root 
        RootTree(P^x) 
        // break an orthogroup if it does not resolve to a single ancestral gene 
        if P^x.root not in {g^x} 
            (OG', OG'') <- BreakOrthogroup(OG^x, P^x) 
            // update the set of putative orthogroups 
            {OG^x} <- ({OG^x} \ OG^x) U (OG', OG'') 
        else 
            Mark OG^x as processed 
    until all orthogroups are processed 
    UpdateSimilarityGraph(x, {OG^x}) 
return {OG^x} 

Figure 18.3 Overview of the Synergy algorithm. The algorithm is initially called with 
the root of the species tree T. 
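
The bottom-up control flow can be illustrated with a short recursive sketch. Everything here is a simplification under stated assumptions: the species tree is a nested tuple, and the `resolve` callback is a hypothetical placeholder for the match, tree-building, rooting, and breaking steps, which are not modeled. The toy `resolve_by_family` merely groups genes by a name prefix so that the example runs.

```python
def synergy(node, genes, resolve):
    """Bottom-up pass over a species tree.

    node: a species name (leaf) or a (left, right) tuple (ancestral species).
    genes: dict mapping species name -> list of gene IDs.
    resolve: callable(left_groups, right_groups) -> list of frozensets;
             a stand-in for the match / tree / root / break steps.
    """
    if isinstance(node, str):                      # extant species: one group per gene
        return [frozenset([g]) for g in genes[node]]
    left, right = node
    left_groups = synergy(left, genes, resolve)    # lower stages are solved first
    right_groups = synergy(right, genes, resolve)
    return resolve(left_groups, right_groups)


def resolve_by_family(left_groups, right_groups):
    """Toy stand-in: merge groups whose genes share a name prefix before '_'."""
    merged = {}
    for group in left_groups + right_groups:
        family = next(iter(group)).split("_")[0]
        merged[family] = merged.get(family, frozenset()) | group
    return list(merged.values())


species_tree = (("a", "b"), "c")
genes = {"a": ["f1_a", "f2_a"], "b": ["f1_b"], "c": ["f1_c", "f2_c"]}
orthogroups = synergy(species_tree, genes, resolve_by_family)
```

The point of the sketch is only the traversal order: each ancestral node receives already-resolved groups from its children and treats them as single units.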

We now expand on the details of each step of the procedure. 
2.5.1. Matching candidate orthogroups 

At each node x of the species tree T, Synergy considers orthology assignments 
for orthogroups pertaining to the species directly below x in the species tree T 
(denoted y and z). Our goal is to capture as many true orthology relations 
as possible without substantially compromising specificity. As noted above, the 
orthogroups from both y and z are now vertices in the gene similarity graph. 
Synergy begins by matching orthogroups in both y and z into candidate 
orthogroups. We assign orthogroups into the same candidate orthogroup if 
they have reciprocal edges between them and apply transitive closure on these 
reciprocal relations. More precisely, for a pair of orthogroups OG_i, 
OG_j ∈ {OG^y} ∪ {OG^z}, we have that OG_i ~x OG_j (i.e., OG_i and OG_j 
are reciprocally connected) if either both OG_i → OG_j and OG_j → OG_i are 
edges of the gene similarity graph or there is a third orthogroup 
OG_k ∈ {OG^y} ∪ {OG^z} such that OG_i ~x OG_k and OG_k ~x OG_j. This leads to 
a partitioning of the orthogroups from species y and z into equivalence classes 
under ~x. Each such equivalence class is taken to be a single candidate 
orthogroup for x. By permitting indirectly connected orthogroups, we increase 
our sensitivity in identifying putative orthologs. However, we stop short of 
defining these candidate orthogroups as ordinary connected components in the 
graph, since this generates candidate orthogroups too coarsely. We nonetheless 
find this partitioning in linear time (in the number of edges) using a standard 
connected component algorithm. 
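
One way to picture the matching step is to keep only reciprocal edges and then take connected components of the resulting undirected graph. The sketch below uses a union-find structure, which is near-linear rather than strictly linear; that choice, the function name, and the input layout are assumptions of this example and not a description of the Synergy implementation.

```python
def match_orthogroups(nodes, edges):
    """Partition nodes into candidate orthogroups.

    nodes: iterable of orthogroup IDs from the two child species.
    edges: set of directed (source, target) pairs from the similarity graph.
    Only reciprocal pairs are linked; classes are their transitive closure.
    """
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]   # path halving keeps trees shallow
            n = parent[n]
        return n

    for u, v in edges:
        if (v, u) in edges and u in parent and v in parent:
            parent[find(u)] = find(v)

    classes = {}
    for n in parent:
        classes.setdefault(find(n), set()).add(n)
    return list(classes.values())


# Toy graph: y1<->z1 and z1<->y2 are reciprocal; y2->z2 and z3->y1 are one-way.
nodes = ["y1", "y2", "z1", "z2", "z3"]
edges = {("y1", "z1"), ("z1", "y1"), ("z1", "y2"), ("y2", "z1"),
         ("y2", "z2"), ("z3", "y1")}
candidates = match_orthogroups(nodes, edges)
```

Here y1 and y2 end up in the same candidate orthogroup through z1, while z2 and z3 stay separate because their edges are not reciprocated.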

This step is similar to many hit-based methods (e.g., COGs; Tatusov 
et al., 1997). Due to our lenient inclusion policy and the promiscuity of 
edges in the gene similarity graph, candidate orthogroups may contain genes 
(orthogroups) that are related through duplication events that predate x, and 
in fact descend from multiple genes in the ancestral species x. Such viola- 
tions of the orthogroup soundness condition (Definition 1) are handled after 
each candidate orthogroup is arranged into a phylogenetic tree. 

2.5.2. Phylogenetic tree reconstruction 

Given a candidate orthogroup OG^x, we reconstruct a phylogenetic tree P^x 
whose leaves are the orthogroups from y and z that comprise OG^x 
(Fig. 18.4A). Recall that since the trees {P^y} and {P^z} were already resolved 
in previous stages, we can treat the root of each of these trees as an extant 
gene in the phylogenetic reconstruction. 

When only a pair of orthogroups OG^y and OG^z are matched into the 
candidate orthogroup OG^x, there is a clear one-to-one orthology relation, 
making this task trivial: the gene tree would appear exactly as the species tree 
appears at the point x (Fig. 18.4A). When an orthogroup OG^x contains 
one-to-many or many-to-many relationships (due to possible duplications 
and/or losses), we reconstruct the tree using the neighbor-joining method 
(Saitou and Nei, 1987) applied to the distance matrix between the 
orthogroups that comprise OG^x. If the leaves of this orthogroup tree are 
singleton genes, then the distances input to this matrix are drawn from the 
pairwise distances described above. The case for distances between 
predefined orthogroups from the lower branches in T is discussed below 
(Section 2.5.5). The result of this reconstruction is an unrooted phylogenetic 
tree, whose leaves are the orthogroups that have been matched together. 

Footnote: We could replace neighbor-joining with other phylogenetic 
reconstruction procedures; our choice was based on the relative efficiency and 
effectiveness of the neighbor-joining procedure, but this component of our 
procedure can be considered modular to the rest. 

Figure 18.4 Construction of phylogenetic gene trees. (A) A gene tree P^x for a 
candidate orthogroup OG^x is constructed by joining the trees P^y and P^z 
resolved in stages y and z. (B, C) When OG^x consists of more than two members, 
there are several alternative rootings. In (B), a root is selected that invokes 
one duplication between species y and the root of the tree, such that OG_1 and 
OG_2 are paralogs. In (C), the rooting suggests a duplication at the root of the 
gene tree, such that two of the orthogroups are paralogous with respect to a 
duplication predating x and one orthogroup is lost after the speciation 
event. If (C) is selected, the orthogroup must be broken. 
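
For readers unfamiliar with neighbor joining, the following is a compact textbook version of the Saitou and Nei procedure. It is a sketch under assumptions: distances are supplied as a dictionary keyed by unordered pairs, at least two leaves are given, internal nodes receive made-up labels `n0`, `n1`, and so on, and the output is an edge list rather than any format used by Synergy.

```python
import itertools


def neighbor_joining(names, dist):
    """Saitou-Nei neighbor joining on two or more leaves.

    names: list of leaf labels.
    dist: dict mapping frozenset({a, b}) -> distance.
    Returns (node, neighbor, branch_length) edges of the unrooted tree.
    """
    d = dict(dist)
    active = list(names)
    edges = []
    counter = itertools.count()
    while len(active) > 2:
        n = len(active)
        total = {a: sum(d[frozenset((a, b))] for b in active if b != a)
                 for a in active}
        # Join the pair minimizing the Q criterion (first pair wins ties).
        i, j = min(itertools.combinations(active, 2),
                   key=lambda p: (n - 2) * d[frozenset(p)] - total[p[0]] - total[p[1]])
        dij = d[frozenset((i, j))]
        li = 0.5 * dij + (total[i] - total[j]) / (2 * (n - 2))
        u = f"n{next(counter)}"
        edges += [(u, i, li), (u, j, dij - li)]
        for k in active:
            if k not in (i, j):
                # Standard distance update from the new node to every other node.
                d[frozenset((u, k))] = 0.5 * (d[frozenset((i, k))]
                                              + d[frozenset((j, k))] - dij)
        active = [k for k in active if k not in (i, j)] + [u]
    a, b = active
    edges.append((a, b, d[frozenset((a, b))]))
    return edges


# Additive toy distances from the tree ((A:1, B:2):1, (C:3, D:4)).
pairs = {("A", "B"): 3, ("A", "C"): 5, ("A", "D"): 6,
         ("B", "C"): 6, ("B", "D"): 7, ("C", "D"): 7}
toy_dist = {frozenset(p): float(v) for p, v in pairs.items()}
tree_edges = neighbor_joining(["A", "B", "C", "D"], toy_dist)
```

Because the toy distances are additive, the procedure recovers the generating branch lengths; real protein distances are noisier, which is one reason the subsequent rooting step matters.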

2.5.3. Tree rooting 

The resulting unrooted tree contains all of the orthogroup components that 
were matched into the candidate orthogroup. To obtain the exact 
phylogenetic relationships between these components, the tree must first be 
rooted. Correct rooting is important since the selected root position may 
also determine whether all of an orthogroup's members descended from a 
single gene in species x or from multiple genes (Fig. 18.4B and C). 

Assuming equal rates of evolution among all the leaves in a tree, a tree's root 
should be approximately equidistant to all the leaves. Given an unrooted tree, 
we compute the leaf-to-root variances for every possible rooting r at an internal 
branch in it, and assign a score to each rooting that is proportional to the 
variance in both amino acid and synteny scores, termed π_r and σ_r, respectively. 

Following a gene duplication, one or both of the paralogs are often under 
relaxed selection, and can evolve at an accelerated rate (Lynch and Katju, 2004; 
Ohno, 1970). This conflicts with the above assumption that all branches of the 
tree evolve at an equal rate, and complicates tree rooting. We therefore 
introduce a preference for root locations that are more likely in terms of 
the number of duplications and losses invoked. For each root position r, we 
compute the number of duplications and losses it implies for each branch 
s below x in the species tree (i.e., either y or z). We denote these as #dups_r^s 
and #loss_r^s, respectively. To estimate the probabilities of such events, we 
assume that they are governed by a Poisson distribution. We define 

$$\omega_r = \prod_{s \in \{y, z\}} P\left(\#\mathrm{dups}_r^s = d_s,\ \#\mathrm{loss}_r^s = l_s \mid \delta_s, \lambda_s\right)$$

where d_s and l_s are the numbers of duplications and losses that r implies on 
branch s, and δ_s and λ_s are the rates of duplication and loss at the branch s, 
respectively. These rates may either be learned by the algorithm through 
repeated iterations or be based on prior knowledge of the studied lineages. 
We select the root for each candidate orthogroup OG_i by combining the 
three scores into a single rooting score ρ_r, reflecting the relative importance 
of each score. We select the rooting that maximizes: 

$$\rho_r(OG_i) = -\alpha \pi_r - \beta \sigma_r + \gamma \omega_r \qquad (18.1)$$

where α, β, and γ are constants specifying the relative contribution to the 
rooting score of peptide similarity, synteny similarity, and the likelihood of 
the invoked duplications and losses. 
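
A small numerical sketch may help make Eq. 18.1 concrete. The code below scores candidate rootings from their root-to-leaf distances and implied event counts. Several choices are assumptions of this example rather than details given in the chapter: the variances are used directly (not merely proportionally), duplications and losses on a branch are treated as independent Poisson counts, the weights α, β, γ default to 1, and all rates and distances are invented.

```python
import math
from statistics import pvariance


def poisson_pmf(k, rate):
    """Probability of observing k events under a Poisson distribution."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def rooting_score(leaf_dists_pep, leaf_dists_syn, events, rates,
                  alpha=1.0, beta=1.0, gamma=1.0):
    """Score one candidate root r in the spirit of Eq. 18.1.

    leaf_dists_pep / leaf_dists_syn: root-to-leaf distances under r.
    events: {branch: (n_dups, n_losses)} implied by r for branches y and z.
    rates: {branch: (dup_rate, loss_rate)}.
    """
    pi_r = pvariance(leaf_dists_pep)        # spread of peptide distances
    sigma_r = pvariance(leaf_dists_syn)     # spread of synteny distances
    omega_r = 1.0
    for s, (dups, losses) in events.items():
        dup_rate, loss_rate = rates[s]
        # Assumption: duplications and losses are independent on each branch.
        omega_r *= poisson_pmf(dups, dup_rate) * poisson_pmf(losses, loss_rate)
    return -alpha * pi_r - beta * sigma_r + gamma * omega_r


# Two hypothetical rootings of the same three-leaf tree.
rates = {"y": (0.1, 0.1), "z": (0.1, 0.1)}
candidates = {
    "b": ([0.5, 0.5, 0.5], [0.2, 0.2, 0.2], {"y": (1, 0), "z": (0, 0)}),
    "c": ([0.2, 0.5, 0.8], [0.2, 0.2, 0.2], {"y": (0, 0), "z": (0, 2)}),
}
scores = {r: rooting_score(pep, syn, ev, rates)
          for r, (pep, syn, ev) in candidates.items()}
best = max(scores, key=scores.get)
```

In this toy case rooting "b" is preferred: its leaves are equidistant from the root and it needs a single duplication, whereas "c" has unequal leaf distances and requires two losses.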

Footnote: The Poisson model assumes that these events occur as a memoryless 
process. This is likely true for most duplications and losses, a notable 
exception being loci with tandemly duplicated genes, where subsequent 
duplications and losses may occur at higher rates. 

2.5.4. Breaking candidate orthogroups 

Once a rooting r for an orthogroup tree P^x is chosen, we may find that the 
root of P^x no longer represents a single gene as the last common ancestor of 
all the genes present, but rather an earlier duplication event from which two 
ancestral genes were derived (Fig. 18.4C). This violates Definition 1, and 
we must therefore split the orthogroup's components at the root of its 
current tree P^x. This situation frequently occurs when orthogroups are 
paralogous with respect to a duplication event that predates x. 

This step allows us to be very permissive with the edges we include 
between genes in the gene similarity graph and in how we match 
candidate orthogroups. By admitting more edges, we include many spurious 
ones, but we also include edges that capture the many-to-many relations 
that may arise from duplications, ensuring that our orthogroups satisfy 
Definition 2 of orthogroup completeness. If the spurious edges cause 
nonorthologous orthogroups to be matched, an accurate rooting will 
subsequently lead the procedure to partition the candidate orthogroup into 
separate orthogroups. Synergy iterates this until each orthogroup represents 
a single ancestral gene and no orthogroups need to be partitioned. 

2.5.5. Updating the gene similarity graph 

Once we have constructed orthogroups at the ancestral node x, we no longer need 
to consider the orthogroups in the species below x individually. We avoid 
doing so by removing vertices in the gene similarity graph that correspond to 
orthogroups in {OG^y} and {OG^z} and introducing new vertices that 
correspond to the newly created orthogroups in {OG^x}. The edges incident to the 
new vertices are acquired by taking the union of the edges that were incident 
to its constituent orthogroups (or genes). To weight these new edges, we 
recall that each new orthogroup represents the root of a tree. Thus, we can 
use the standard neighbor-joining distance updating rule in the order 
specified by the topology of the orthogroups' corresponding gene trees (Fig. 18.5). 
When one of the distances in question is not defined in the original similarity 
graph, we use the maximal distance value. 

The edges in this updated similarity graph can always be traced to one 
(or more) edges between extant genes in the original similarity graph. 
However, reciprocal edges between two orthogroups (that might lead 
them to be merged into the same orthogroup in subsequent iterations) 
may originate from two different pairs of extant genes that are assigned to 
the two orthogroups. Thus, our matching criterion is able to capture 
nontrivial paths in relating the extant genes. 
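
The recursive edge-weight update can be sketched as a memoized function that expands an orthogroup one level down its gene tree and applies the neighbor-joining update rule. The data layout (`children`, `leaf_dist`), the function names, and the use of 2.0 as the maximal distance (taken from the score range in Section 2.3) are assumptions of this example.

```python
from functools import lru_cache

MAX_DIST = 2.0  # assumed maximal distance for pairs with no edge in the original graph


def make_distance(children, leaf_dist):
    """Build a cached distance function over genes and orthogroups.

    children: {orthogroup: (left, right)} taken from the gene trees.
    leaf_dist: {frozenset({g1, g2}): distance} between extant genes.
    """
    @lru_cache(maxsize=None)
    def dist(u, v):
        if u == v:
            return 0.0
        if u in children:                      # expand u one level down its tree
            i, j = children[u]
            return 0.5 * (dist(i, v) + dist(j, v) - dist(i, j))
        if v in children:                      # otherwise expand v instead
            return dist(v, u)
        return leaf_dist.get(frozenset((u, v)), MAX_DIST)
    return dist


# Toy gene trees: OGk joins OGi and gene gj; OGi joins genes g1 and g2.
children = {"OGk": ("OGi", "gj"), "OGi": ("g1", "g2")}
leaf_dist = {
    frozenset(("g1", "g2")): 0.4,
    frozenset(("g1", "gj")): 1.0,
    frozenset(("g2", "gj")): 1.0,
    frozenset(("g1", "gm")): 1.2,
    frozenset(("g2", "gm")): 1.2,
    # gj-gm has no edge, so MAX_DIST is used for that pair.
}
dist = make_distance(children, leaf_dist)
d_km = dist("OGk", "gm")
```

The cache plays the role described in the caption of Fig. 18.5: each intermediate distance is computed once and reused as the recursion descends the gene tree.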




3. Evaluating Orthogroup Quality 

Many comparative genomics methods were first applied in studies of 
Ascomycota fungal genomes. These organisms are phenotypically variable in 
substantial ways, and their genomes have undergone major changes, includ- 
ing an ancestral whole-genome duplication (WGD) which resulted in the 
Figure 18.5 Schema of how edge weights are updated in the gene similarity graph. 
Edge weights between two nodes in the updated graph are calculated by applying the 
neighbor-joining distance update rule to the constituent nodes within the orthogroups. 
Dashed lines denote edges induced from updated graphs, while solid lines denote edges 
present in the initial graph Q. (A) Without loss of generality, the distance between a 
singleton node $g_m$ and a node representing the orthogroup $OG_k$ is computed according 
to the tree structure of $P_k$. (B) To obtain $d_{km}$, the pairwise distances between $g_m$, $OG_j$, 
and $g_j$ are used. (C) During each step down the tree, the recursive distance updating 
procedure expands the current orthogroup, indexed as $OG_k$. This process is repeated 
until the leaf nodes of $P_k$ are reached. In practice, the distances are cached each time 
they are calculated to avoid repeated computation. 



retention of hundreds of paralogous genes in subsequent lineages. They thus 
offer an excellent opportunity to study evolutionary relations between genes. 
We applied our approach to identify orthologs and reconstruct gene 
trees in over a dozen Ascomycota genomes (Fig. 18.6). To assess the quality of 
[Figure 18.6 species tree (image not reproduced). Species shown, top to bottom. Hemiascomycota: S. cerevisiae, S. paradoxus, S. mikatae, S. bayanus, C. glabrata, S. castellii, K. lactis, A. gossypii, K. waltii, D. hansenii, C. albicans, Y. lipolytica. Euascomycota: N. crassa, F. graminearum, M. grisea, A. nidulans. Archeascomycota: S. pombe.]

Figure 18.6 Species tree of Ascomycota fungi included in this study. Additional classifications 
for the Hemiascomycota (top), Euascomycota (middle), and Archeascomycota (bottom) 
clades, the WGD event (star), and post-WGD species (darker shade) are indicated. The branch 
lengths are proportional to the estimated evolutionary distances. Species in bold were 
included in our evaluation of orthogroup accuracy by comparison to YGOB 
(Section 3.2). 



our results we relied on three measures. First, we estimated the robustness of 
the orthology assignments based on a new jackknife procedure. Second, we 
compared our predictions to a manually curated gold standard and to those 
of another more traditional hit-based approach. Third, we measured our 
performance on a simulated dataset of orthogroup evolution. We used a 
smaller subset of six genomes (Fig. 18.6) to assess the quality of our predictions, 
and then expanded to additional genomes for further biological analysis. 



3.1. Fungal orthogroup robustness 

To empirically measure Synergy's robustness and to evaluate our confidence 
in each orthogroup, we developed a jackknife-based approach. We system- 
atically and repeatedly excluded ("held out") different portions of the data, 
and measured the robustness of orthogroup assignment to (1) the choice of 
species included and (2) the accuracy of gene predictions within each 
species. We tested the soundness and completeness of the identified 
orthogroups under each type of perturbation. 
A sound orthogroup (Definition 1) contains only the genes that descended 
from a single common ancestor, and thus its genes should not 
"migrate out" of it in the holdout experiments. To test this, we count the 
number of orthologous gene pairs $(g_j, g_k)$ in an orthogroup $OG_i$ that 
remained orthologous in our holdout experiments $H$. We compute the 
soundness bootstrap score $\eta_i^s$ for each orthogroup $OG_i$ by counting the 
fraction of orthology assignments that remained constant across each holdout 
experiment $h \in H$: 

$$\eta_i^s = \frac{\sum_{h \in H} \left|\{(g_j, g_k) \in OG_i \mid h(g_j, g_k) = OG_i(g_j, g_k)\}\right|}{N} \tag{18.2}$$

where $h(g_j, g_k)$ and $OG_i(g_j, g_k)$ specify the last species in the tree in which $g_j$ 
and $g_k$ share a common ancestor in the holdout experiment $h$ and the 
original orthogroup, respectively (this is equal to $-1$ if $g_j$ and $g_k$ are not 
members of the same orthogroup), and $N$ is the number of comparisons made 
across all holdout experiments, $N = |H| \cdot |OG_i|(|OG_i| - 1)/2$. 
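
A minimal sketch of the soundness computation follows. It assumes that each run (the original and every holdout) is summarized as a dictionary mapping a gene pair to the label of the last species in which the pair shares an ancestor, with absent pairs treated as -1; the function and argument names are hypothetical. It does not adjust for pairs whose genes were themselves hidden in a holdout, which a full implementation would need to handle.

```python
from itertools import combinations


def soundness_score(genes, original_lca, holdout_lcas):
    """Fraction of gene-pair assignments that are unchanged across holdouts.

    genes:         gene identifiers in the orthogroup
    original_lca:  dict frozenset({g_j, g_k}) -> species label in the original run
    holdout_lcas:  list of dicts with the same keys, one per holdout experiment;
                   a missing pair is treated as -1 (not in the same orthogroup)
    """
    pairs = [frozenset(p) for p in combinations(genes, 2)]
    n = len(holdout_lcas) * len(pairs)  # N = |H| * |OG|(|OG| - 1) / 2
    if n == 0:
        raise ValueError("need at least two genes and one holdout experiment")
    kept = sum(
        1
        for h in holdout_lcas
        for p in pairs
        if h.get(p, -1) == original_lca[p]
    )
    return kept / n
```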

A complete orthogroup (Definition 2) contains all the genes that descended 
from a single common ancestor, and thus new genes should not 
"migrate into" the orthogroup in the holdout experiments. We use a similar 
formula to obtain the completeness confidence score $\eta_i^c$, except we count 
the number of pairs of nonorthologous genes $(g_j, g_k)$, $g_j \in OG_i$, $g_k \notin OG_i$, 
that became orthologous in each holdout condition $h$: 

$$\eta_i^c = 1 - \frac{\sum_{h \in H} \left|\{(g_j, g_k) \mid g_j \in OG_i,\ g_k \notin OG_i,\ h(g_j, g_k) \neq -1\}\right|}{|H| \cdot \left|\{(g_j, g_k) \mid g_j \in OG_i,\ g_k \notin OG_i\}\right|}$$

Since pairs of genes that share no protein sequence similarity are highly 
unlikely to be considered orthologous in h, we restrict our tests to gene pairs 
that can be loosely regarded as similar (E < 0.1), rendering this task com- 
putationally feasible. 

We compute the confidence measures, $\eta_i^s$ and $\eta_i^c$, for both species- and 
gene-holdout experiments, generating four measures of robustness for each 
orthogroup. For the gene-holdout experiments, we set the probability of 
holding out each gene at 0.1, and performed 50 holdout experiments. We 
performed the branch-holdout experiments by removing each branch in the 
tree separately once (Fig. 18.6), resulting in 31 separate holdout 
experiments. 

Of the nonsingleton orthogroups identified, 79% had all four confidence 
values above 0.9, and 99% obtained a confidence value above 0.9 in at least 
one class of experiments (Fig. 18.7). Perturbations to gene content were 



We must account for the fact that some assignments are expected to change when genes within an 
orthogroup are among those hidden. 
Figure 18.7 Cumulative distributions of confidence scores for orthogroup soundness 
and completeness under species and gene holdout experiments. Most orthogroups are 
robustly sound and complete to both types of perturbations. 



more disruptive than to species, and soundness was more robust than 
completeness (i.e., perturbations introduced more new "incorrect" orthol- 
ogies than loss of "correct" ones). 

As expected, orthogroups exhibiting higher frequencies of duplication 
and loss events tended to be most sensitive to such perturbations, although 
Synergy's performance was surprisingly robust for even those orthogroups. 
Lack of such duplication and loss events significantly correlated with higher 
confidence values (P < 10 in all four measures). Overall, Synergy was 
remarkably robust to perturbations in the species phylogeny or noisy gene 
predictions. 



3.2. Comparison to curated resource 

We next assessed how Synergy's predictions align with the assignments of a 
manually curated gold standard, the Yeast Gene Order Browser (YGOB; 
Byrne and Wolfe, 2005). YGOB contains orthologies for six of the species 
we investigate based on sequence similarity, chromosomal alignment, and 
intensive manual curation. This resource is limited by its assumption that 
the WGD is the only duplication event in this lineage and by its 
predominant reliance on synteny to assign orthology relations. Nonetheless, it 
provides a "gold standard" of orthology and paralogy relations which we 
use to evaluate our automated methods. 

We found that Synergy's automatic predictions conform very well to 
those of the YGOB "gold standard" for the relevant species (Fig. 18.8). 
Figure 18.8 Comparison of Synergy and InParanoid (Remm et al., 2001) predictions 
to the gold standard of YGOB (Byrne and Wolfe, 2005). The matrix displays the 
sensitivity (lower left cells) and specificity (upper right cells) of orthology assignments 
in YGOB that were automatically identified by Synergy (top number) and InParanoid 
(bottom, italicized) for each pair of species. Synergy achieved higher sensitivity at 
identifying orthology relations than InParanoid, albeit at apparently lower specificity 
rates. However, because YGOB was designed specifically to study the WGD in these 
yeast species using syntenic relations, Synergy may include many orthology assignments 
that were not detected by YGOB due to lack of chromosomal evidence, making this an 
underestimate of Synergy's specificity performance. The diagonal shows the percent of 
paralogues reported by YGOB that were detected by Synergy and InParanoid. 



For example, Synergy was able to automatically identify over 80% of the 
orthology assignments between all pairs of species. More significantly, 
Synergy was able to resolve at a similarly high level of accuracy the precise 
paralogy relations within orthogroups where both paralogous copies were 
maintained in at least one species following the whole-genome duplication. 
This task is challenging since determining which pairs of genes that were 
retained in duplicate are orthologous requires disambiguating between 
genes sharing high degrees of sequence similarity. 

We also compared the quality of Synergy's paralogy assignments to that 
of InParanoid (Remm et al., 2001), a hit-clustering method designed to 
identify paralogous relations between genes within genomes. Synergy iden- 
tified more known paralogs dating to the WGD than InParanoid did 
(Fig. 18.8). Unlike InParanoid, Synergy also resolved orthologies and 
gene trees for multiple species simultaneously. 
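
A pairwise comparison against a curated reference can be sketched as follows. This is an illustration only: it assumes orthology calls are available as sets of unordered gene pairs, and it assumes that the "specificity" reported here is the fraction of predicted pairs that also appear in the reference, which may differ from the exact definition used to produce Fig. 18.8. The function name is hypothetical.

```python
def pairwise_agreement(predicted, gold):
    """Compare predicted orthologous pairs with a gold standard.

    predicted, gold: sets of frozenset({gene_a, gene_b})
    Returns (sensitivity, fraction_of_predictions_supported).
    """
    shared = predicted & gold
    # Sensitivity: share of gold-standard pairs that were recovered.
    sensitivity = len(shared) / len(gold) if gold else float("nan")
    # Assumed reading of "specificity": share of predictions found in the gold standard.
    supported = len(shared) / len(predicted) if predicted else float("nan")
    return sensitivity, supported
```

Under this reading, predictions that are correct but absent from the reference lower the second number, which is why it would underestimate the true specificity.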

Synergy also showed greater sensitivity than InParanoid when identify- 
ing orthology relations, albeit potentially at the cost of lower specificity. 
Some of the reduced specificity may be the result of a limitation of our gold 
standard: While YGOB's annotations are highly accurate, their methodol- 
ogy is limited by two assumptions: (1) all duplication events originated in 
the WGD and thus orthology is at most a 2-to-1 relationship and (2) gene 
order is nearly always conserved and thus can be used as the primary source 
of evidence for shared ancestry. These assumptions relegate a greater proportion 
of genes to singleton status without orthologs, and a far smaller proportion of 
YGOB's orthologous loci are ancestral to all of their species than those 
that Synergy identified (62% vs. 82%). We therefore believe that many of 
the orthology assignments reported by Synergy but not by YGOB 
(or InParanoid) (Fig. 18.8, green cells) are likely to be correct assignments. 
To estimate the contribution of including synteny in our approach, we 
reran Synergy on these data while ignoring the genes' locations. We found 
that synteny played a relatively minor role in predicting a gene's correct 
orthologs, but contributed substantially to reconstructing the correct gene 
trees. For example, over 200 duplication events were detected at the root of 
the sensu stricto species when ignoring synteny, most of which should have 
been traced to the WGD. The contribution of synteny information to 
orthology prediction may be most noticeable in cases where genes are under- 
going exceptionally slow or fast rates of evolution, as is often the case between 
paralogs undergoing gene conversion or neofunctionalization (Kellis et al., 
2004a; data not shown). It is here that synteny can help the most when 
deciding how to root the gene tree in Stage 2 of the algorithm (Section 2.5.2). 



3.3. Simulated orthogroups 

To obtain an objective measure of Synergy's accuracy, we simulated 
orthogroup evolution including multiple rounds of speciation events and 
with prespecified rates of gene duplication and loss. At each stage, we used 
the SEQ-GEN program (Rambaut and Grassly, 1997) to simulate the evolu- 
tion of protein sequences using the JTT model of amino acid substitution 
(Jones et al., 1992). In order to make these simulations as true to fungal 
protein sequences as possible, we initiated the simulations with varying 
numbers of randomly drawn sequences from Saccharomyces cerevisiae. For the 
purposes of this benchmark, we ignored the chromosomal ordering of the 
simulated sequences, since to the best of our knowledge there is no generally 
agreed-upon model for chromosomal evolution. As a result, this test evaluates 
Synergy's performance when no synteny information is considered. 

We parameterized our simulations as follows: Using a balanced species 
tree topology containing 16 species, we gave each orthogroup a probability 
of 0.1 of incurring a duplication or loss along every branch of the species 
tree. These duplication and loss rates are relatively high, but we were interested 
in examining how well Synergy performs under such volatile conditions. 
We specified the rates of amino acid substitutions between orthologs 
by the branch lengths in the simulated gene trees. These lengths were drawn 
from an exponential distribution with a mean of 0.36 (approximately the 
mean branch length in the fungal species tree we used). 
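
The parameterization above can be sketched as a per-branch sampling step. This is a simplified illustration, not the simulation code used in the study: it assumes that 0.1 is the total per-branch probability of an event and that a duplication and a loss are equally likely once an event occurs. The function name and branch labels are hypothetical, and sequence evolution itself (SEQ-GEN) is not modeled.

```python
import random


def simulate_branch_events(branches, p_event=0.1, mean_length=0.36, seed=0):
    """Draw a branch length and an optional duplication/loss event per branch.

    Returns a dict: branch -> (length, event), where event is None,
    "duplication", or "loss".
    """
    rng = random.Random(seed)
    result = {}
    for branch in branches:
        # Exponential branch length with the requested mean (rate = 1 / mean).
        length = rng.expovariate(1.0 / mean_length)
        event = None
        if rng.random() < p_event:
            # Assumption: duplication and loss are equally likely.
            event = rng.choice(["duplication", "loss"])
        result[branch] = (length, event)
    return result


# A balanced, rooted tree with 16 leaves has 30 branches.
example = simulate_branch_events([f"branch_{i}" for i in range(30)], seed=1)
```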
Synergy accurately detected over 85% of the orthologous relations in our 
simulated datasets of various sizes (Fig. 18.2). Further, its specificity was 
remarkably high (nearly 99%) despite the presence of many paralogs in 
the genome from which the simulated sequences were originally drawn. This 
sensitivity might have been further improved had we included chromosomal 
order in these simulations, allowing Synergy to predict paralogs more 
robustly. Importantly, we found no significant trend suggesting that the num- 
ber of sampled sequences affected Synergy's overall performance in these 
simulations. 

While we recognize that the implications of such benchmarks should be 
carefully interpreted, we believe that these simulations accurately reflect 
Synergy's strong performance on data that is based on a reasonable model of 
fungal sequence evolution. 




4. Biological Analysis of Gene Histories 

A high-quality set of orthogroups and gene trees (from Synergy or 
future improved tools) opens the way for comprehensive analysis of the 
relation between gene evolution and function. Such analyses require a host of 
genomics resources and tools. The rich genomics datasets collected for 
S. cerevisiae (and increasingly for other yeasts) make such analyses possible. 
Here, we highlight several key approaches to this challenge that combine 
available genomic resources and datasets with the evolutionary 
categorization of orthogroups. 



4.1. Defining orthogroup categories 

Each orthogroup's gene tree has its own set of characteristics that describe 
its history. We can use these characteristics to categorize orthogroups. For instance, 
the gene tree associated with the orthogroup in Fig. 18.9A has the same 
topology as the species tree and exhibits no duplications or losses throughout 
its history. In contrast, the orthogroup in Fig. 18.9B contains a duplication that 
led to two paralogous loci (IFH1 and CRF1), one of which was subsequently 
lost in Candida glabrata (CRF1), as well as a loss in the lineage leading to the 
Euascomycota clade. These features can be used to categorize genes into sets. For 
example, we say that the SEN34 gene is "uniform" because it has a 
uniform history with no duplication or loss. It can also be described as 
"persistent", because its orthogroup contains at least one gene in all of the species. IFH1 and 
CRF1 are not included in these categories, since they do not have a uniform 
history within this phylogeny, nor are they represented in at least one copy in 
all of the genomes. As we will describe, such categorization allows for further 
analysis with respect to how gene functions and histories might be related. 
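
As a simple illustration of these two categories, the sketch below labels an orthogroup from its copy number in each extant species and its counts of duplication and loss events. The function name, the input layout, and the abbreviated species names in the usage note are hypothetical.

```python
def categorize(copy_numbers, n_duplications, n_losses):
    """Return the set of history-based labels that apply to an orthogroup.

    copy_numbers:   dict species -> number of genes from this orthogroup
    n_duplications: number of duplication events in the gene tree
    n_losses:       number of loss events in the gene tree
    """
    labels = set()
    if n_duplications == 0 and n_losses == 0:
        labels.add("uniform")      # no duplications or losses in its history
    if all(count >= 1 for count in copy_numbers.values()):
        labels.add("persistent")   # at least one gene in every species
    return labels
```

An orthogroup with one gene in every species and no events receives both labels, whereas one with a duplication, two losses, and no representative in some species receives neither.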
Figure 18.9 Gene trees and copy-number variations. (A) A uniform orthogroup that 
contains the S. cerevisiae gene SEN34. The topology of the gene tree (left panel) is 
identical to that of the species tree. Genes in the orthogroup's tree are named on the right, 
next to the four-letter abbreviated species names. (B) Orthogroup containing the 
S. cerevisiae genes CRF1 and IFH1. The gene tree's topology (left panel) differs from 
that of the species tree and shows a single duplication event (star) and two loss events 
(strikes). (C) The extended phylogenetic profile of the orthogroup in (B) summarizes 
the number of genes in the orthogroup at each extant and ancestral species in the tree 
(numbered boxes). 
Another major type of history-based categorization reflects the particular 
"age" of a gene. Many genes only have orthologs tracing back to a specific 
point in the phylogeny, where they seem to "appear" (e.g., Fig. 18.10A). 
These may indicate points of genomic innovations or elevated mutation 
rates among these loci. Genes that appear at the same branch of the 
phylogeny can be categorized into the same set. Similarly, those that are 
"ancestral" to the phylogeny are an additional gene set. Note that ancestral 
genes are not always persistent (e.g., Fig. 18.9B). To ensure that appearing 
genes do not result from orthology mispredictions, we can conduct 
more sensitive remote homology searches to determine whether a gene has 
homologs that are more distant than those denoted by its reconstructed 
tree (Durbin et al., 1998) (Table 18.1). 

4.2. Singletons and ORF predictions 

We define genes that appear solely in an individual species as "singleton" or 
orphan genes. These genes have no orthologs; they may be lineage-specific 
genes that perform a function unique to that species or, more often, 
mispredicted ORFs. In this way, orthogroup assignments can be 
used to refine genome annotations. We discarded singleton ORFs for most 
of our analyses, but recent works by others (e.g., Fedorova et al., 2008; 
Kasuga et al., 2009; Khaldi and Wolfe, 2008; Li et al., 2008) have explored 
the origins and roles of these genes in different species. 

4.3. Gene sets and orthogroup projections 

To study the relations between gene history and gene function, we must 
first gather functional categories describing the genes' roles within an 
organism. These gene sets may be derived from experimental data, domain 
composition, or homology to known genes. Some of the main resources of 
S. cerevisiae gene sets are (the number of gene sets is in brackets): the Gene 
Ontology (Ashburner et al., 2000) (GO) hierarchy (1794), the Kyoto 
Encyclopedia of Genes and Genomes (Ogata et al., 1999) (KEGG) (87), 
the BioCyc database (Karp et al., 2005) (107), the MIPS database of 
manually curated protein complexes (Mewes et al., 2002) (1022), target 
genes bound by various transcription factors (Harbison et al., 2004) (310), 
target genes harboring a given cis-regulatory element in their promoters 
(Harbison et al., 2004) (70), and targets of RNA binding proteins (Gerber 
et al., 2004) (5). Additional resources which are important for the assess- 
ment of the relation between gene history and functions include datasets 



These 3395 gene sets were used in most of the analyses below, and in particular to construct transcriptional 
modules. 
Figure 18.10 Appearing genes and extended copy-number variation. (A) An appearing 
orthogroup that contains the S. cerevisiae gene IME1. The gene tree topology (left, black 
lines) differs from that of the species tree (right) as it is only supported by genes from the 
clade spanning K. waltii and S. cerevisiae. The extended phylogenetic profile (EPP, center, 
numbered boxes) shows the gene copy number for all the species. The copy-number 
variation profile (ECVP, right, numbered boxes) indicates the changes in gene copy 
number. An increase (+1) is placed at the first ancestral species where this orthogroup is 
traced to (appears in). The S. castellii ortholog in this orthogroup was lost (−1, blue strike). 
(B) A persistent orthogroup that contains the S. cerevisiae genes RPS19A and RPS19B, 
which encode proteins of the small ribosomal subunit. The gene tree topology (left) differs 
from that of the species tree (center, right) as it includes both duplication (star) and loss 
(strike) events. The EPP and ECVP for the orthogroup are shown in the center and right panels, 
respectively. The ECVP indicates an increase in copy number (+1) due to gene duplication 
at the WGD and along the branch leading to S. pombe (stars). One of the WGD paralogs 
was subsequently lost in C. glabrata (−1, strike). Despite the loss event, the orthogroup 
contains at least one member gene in each extant species. 
Table 18.1 Summary of terms used to characterize orthogroups 

Term        Definition 
Uniform     No retained duplications or losses across all lineages 
Persistent  Retain at least one ortholog in all lineages 
Volatile    Much greater number of duplication and loss events than expected by chance 
Ancestral   Present at the last common ancestor of all the species studied 
Appearing   Could only be traced up to a certain point in the phylogeny; the opposite of ancestral 
Singleton   Individual genes that appear at a leaf node in the phylogeny, having no orthologs identified in any other lineage 



measuring genes controlled by the SAGA and/or TFIID transcription 
complexes (Huisinga and Pugh, 2004), genes with and without TATA 
box control (Tirosh et al., 2006), genes with large levels of expression 
variation between yeast species (Tirosh et al., 2006), genes with high and 
low levels of noise in protein abundance (Newman et al., 2006), haploinsufficient 
genes (Deutschbauer et al., 2005), genes whose overexpression 
reduces fitness (Sopko et al., 2006), and genes belonging to complex cores, 
attachments and modules based on high-throughput assays (Gavin et al., 
2006). To streamline our analysis and build on the extensive expression data 
available for S. cerevisiae, we used these gene sets to determine a hierarchy of 
transcriptional modules following the procedure presented by Segal et al. 
(2004). Each module contains functionally related genes with coherent 
expression patterns. We applied this procedure to the 3395 gene sets 
described above and a compendium of 1216 arrays and followed it by 
manually selecting which modules to use in the hierarchy (for full details 
see Wapinski et al., 2007a). 

These gene sets allow us to analyze orthogroups in the context of their 
cellular functions, regulatory relations, and condition-dependent expression 
programs. We can readily create orthogroup sets from these gene sets by 
assigning orthogroups to sets according to their genes' annotations. We then 
use Fisher's exact test to measure the statistical significance of the overlap 
between orthogroup sets (e.g., uniform orthogroups) and functional categories 
(e.g., meiosis orthogroups). Significant enrichments allow us to 
evaluate the global constraints that influence the evolutionary trajectories 
of genes. 
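
For an enrichment question of this kind, the one-sided Fisher's exact test reduces to the upper tail of a hypergeometric distribution. The sketch below computes that tail with the standard library only; the function and argument names are hypothetical, and the choice of a one-sided test is an assumption, since the text does not specify the alternative used.

```python
from math import comb


def enrichment_p_value(n_total, n_category, n_functional, n_overlap):
    """One-sided Fisher's exact test for enrichment (hypergeometric upper tail).

    n_total:      number of orthogroups considered
    n_category:   orthogroups in the evolutionary category (e.g., uniform)
    n_functional: orthogroups in the functional set (e.g., meiosis)
    n_overlap:    orthogroups in both sets
    Returns P(overlap >= n_overlap) under random assignment.
    """
    upper = min(n_category, n_functional)
    denom = comb(n_total, n_functional)
    tail = sum(
        comb(n_category, k) * comb(n_total - n_category, n_functional - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / denom
```

With 10 orthogroups, 5 of them in the category, and a functional set of 4 that falls entirely within the category, the probability is 5/210, or about 0.024.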

For instance, when applying this approach to our Ascomycota 
orthogroups we found that the ancestral orthogroups are strongly enriched 
in S. cerevisiae genes that are essential; 1008 of 1047 genes essential for 
growth in rich YPD medium are ancestral (P = 1.3 × 10 ). On the 
other hand, the clade spanning S. cerevisiae and K. waltii is marked by 
appearing orthogroups that contain S. cerevisiae genes whose annotations 
are significantly enriched for meiosis and sporulation genes (51/166 sporulation 
genes, P = 2.49 × 10 ), including the master meiosis regulator 
IME1. 

Notably, we can extend this functional analysis from gene sets to 
interaction networks. Previous studies have assembled physical and genetic 
interaction networks from several existing manually curated and high- 
throughput data sources (Reguly et ah, 2006). The networks can be repre- 
sented as graphs where the genes or proteins are nodes and an edge is placed 
between interacting genes or proteins. As discussed below, we can annotate 
such networks with the evolutionary categories (e.g., uniform, persistent, 
appearing) and use them to test the relation between evolutionary history 
and gene function. 

4.4. Copy-number variation profiles 

The orthogroup categories described above provide a concise, simple 
measure of a gene's history but obscure some of the finer details. For 
example, two orthogroups can both be persistent, but include very different 
patterns of gene duplication and loss, which are represented in the associated 
gene trees. To capture this, we expand on the well-known phylogenetic 
profiling approach that considers the profile of species in which each gene 
is present or absent (Marcotte et al., 1999). 

We define an extended phylogenetic profile that includes the number of 
gene copies present at each extant and ancestral species in the phylogeny 
(Fig. 18.10). From these profiles, we can readily derive an extended copy-number 
variation profile (ECVP) that measures the changes in gene copy number 
through duplications and losses. We compute ECVPs by inspecting 
the extended phylogenetic profile of an orthogroup and counting the 
number of duplications and losses that occur along each index of the species 
tree. We subtract the number of losses from the number of duplications at 
each index to generate the ECVP, thus summarizing these events in a 
numerical vector that is amenable to further analysis (Fig. 18.10, right). 
We increment this copy-number variation profile at the last common 
ancestor identified for the orthogroup, indicating its age. 
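
The ECVP construction can be sketched as follows, assuming that the duplication and loss counts at each node of the species tree have already been read off the reconstructed gene tree. The function name, the input layout, and the node labels in the usage note are hypothetical.

```python
def ecvp(node_order, duplications, losses, appearance_node):
    """Build an extended copy-number variation profile as a list of integers.

    node_order:      species-tree nodes (extant and ancestral) in a fixed order
    duplications:    dict node -> number of duplications assigned to that node
    losses:          dict node -> number of losses assigned to that node
    appearance_node: last common ancestor identified for the orthogroup
    """
    profile = []
    for node in node_order:
        change = duplications.get(node, 0) - losses.get(node, 0)
        if node == appearance_node:
            change += 1  # mark the point where the orthogroup appears (its age)
        profile.append(change)
    return profile
```

For nodes ordered as `["root", "anc", "sp1", "sp2", "sp3"]`, an orthogroup that appears at `anc` and is lost in `sp2` yields `[0, 1, 0, -1, 0]`.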

4.4.1. ECVPs within orthogroup classes 

Using ECVPs we can ask whether genes that share a function also share 
similar (coherent) histories. To assess the coherence in gene copy-number 
variation across a class of orthogroups, we first calculated the class centroid 
ECVP by averaging the ECVPs from all the orthogroups belonging to a 
class. This centroid is then used to estimate the degree of deviation 
among the orthogroups belonging to a class by summing the L1 distance 
from each of the class orthogroups to it. We compare this deviation to that 
of 10,000 randomly assigned orthogroup classes, each containing the same 
number of ECVPs. The fraction of times the random deviation is equal to or less 
than that of the orthogroup class is the measure (P-value) we use to evaluate 
the coherence of that class. Since copy-number variation occurs at each 
individual branch of the species tree, we similarly define a coherence profile 
for an orthogroup class by evaluating the copy-number variation coherence 
for each position along the species tree. 
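
The coherence test for a whole class can be sketched as a permutation procedure. This is a minimal illustration under the assumption that random classes are drawn without replacement from the full pool of ECVPs; the function names are hypothetical, and the per-branch coherence profile is not shown.

```python
import random


def l1(a, b):
    """L1 (Manhattan) distance between two equal-length profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))


def deviation(profiles):
    """Sum of L1 distances from each profile to the class centroid."""
    n = len(profiles)
    centroid = [sum(column) / n for column in zip(*profiles)]
    return sum(l1(p, centroid) for p in profiles)


def coherence_p_value(class_profiles, all_profiles, n_random=10000, seed=0):
    """Fraction of random classes whose deviation is <= that of the real class."""
    rng = random.Random(seed)
    observed = deviation(class_profiles)
    size = len(class_profiles)
    hits = 0
    for _ in range(n_random):
        sample = rng.sample(all_profiles, size)  # random class of the same size
        if deviation(sample) <= observed:
            hits += 1
    return hits / n_random
```

A small value indicates that few random classes are as tightly clustered around their centroid as the class of interest.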

In our catalog, functional constraints on copy-number variation manifest 
in remarkably similar patterns of specific duplications and losses in function- 
ally related orthogroups. To show this, we compared the extended copy- 
number variation profiles (ECVPs, which track these events) among sets of 
orthogroups harboring functionally related S. cerevisiae genes. We found that 
functionally related orthogroups, in particular those related to growth, often 
show coherent profiles of duplication and loss events (Fig. 18.11), indicating 
that such events occur in concert for functional counterparts. Despite this 
general pattern, some classes of orthogroups, especially those related to 
stress, do not show such coherence: these are often related to known sources 
of phenotypic variation between different yeasts (e.g., the cell wall organization 
and biosynthesis class; Fig. 18.11E) or to very general functions that 
would not be expected to have shared evolutionary constraints (e.g., oxidoreductases). 
Classes of orthogroups defined by genes with a common 
regulatory mechanism (i.e., shared cis-regulatory motif (Harbison et al., 
2004), transcription factor (Harbison et al., 2004), or RNA binding protein 
(Gerber et al., 2004)) are typically not coherent, suggesting that regulatory 
mechanisms do not impose strong evolutionary constraints on copy-num- 
ber variation. Coherence is not merely a reflection of the prevalence of 
uniform orthogroups, since these observations are maintained even when all 
uniform orthogroups are omitted from the analysis. 



4.4.2. ECVPs within interaction networks 

We can also use ECVPs to assess the evolutionary constraints imposed by 
physical and genetic interactions. To test the relation between proximity in 
interaction networks and similarity in copy-number variation, we first 
computed the difference (using the L1 distance) between the ECVPs for 
each pair of proteins in the network, ignoring pairs that belong to the same 
orthogroup (hence sharing the same profile). Next we averaged these 
differences among all proteins within a given radius in the network. To 
determine whether these averages were significant, we repeated this com- 
putation by shuffling profile assignments to proteins in the network 1000 
times, obtaining the expected range of average differences between pairs of 
proteins in the network for each radius. 
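
The network analysis can be sketched as follows, assuming the network is given as an adjacency dictionary and that network distance is measured as the shortest-path length in edges. All names are hypothetical. In the shuffling step, the orthogroup label travels with its profile so that same-orthogroup pairs remain excluded.

```python
import random
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path distance (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def mean_profile_difference(adj, profiles, orthogroup, radius):
    """Average L1 difference between ECVPs of gene pairs within `radius` edges."""
    total, count = 0, 0
    genes = sorted(profiles)
    for u in genes:
        dist = bfs_distances(adj, u)
        for v in genes:
            if u < v and dist.get(v, radius + 1) <= radius:
                if orthogroup[u] == orthogroup[v]:
                    continue  # same orthogroup, hence the same profile
                total += sum(abs(a - b) for a, b in zip(profiles[u], profiles[v]))
                count += 1
    return total / count if count else float("nan")


def shuffled_means(adj, profiles, orthogroup, radius, n_shuffles=1000, seed=0):
    """Null distribution obtained by shuffling profile assignments among genes."""
    rng = random.Random(seed)
    genes = sorted(profiles)
    results = []
    for _ in range(n_shuffles):
        permuted = genes[:]
        rng.shuffle(permuted)
        new_profiles = {g: profiles[p] for g, p in zip(genes, permuted)}
        new_groups = {g: orthogroup[p] for g, p in zip(genes, permuted)}
        results.append(mean_profile_difference(adj, new_profiles, new_groups, radius))
    return results
```

Comparing the observed mean at each radius with the percentiles of `shuffled_means` gives the expected range under random assignment.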

Indeed, when we examined the ECVPs of pairs of orthogroups contain- 
ing (nonparalogous) S. cerevisiae genes, we found that those that are nearby 
in either a biochemical network or a genetic network tend to have 
Figure 18.11 Coherent evolution of functionally related proteins. (A) Phylogenetic 
coherence of the tRNA spliceosome orthogroup class. The set of S. cerevisiae tRNA 
spliceosome genes (MIPS) was mapped to the set of orthogroups that contain these 
genes (black arrows, middle panel). Some orthogroups (e.g., #28162) contain multiple 
paralogs from the gene set. The ECVPs of all the orthogroups in the set are compiled 
into a matrix (left panel). Each row denotes one profile, and each column the copy 
number changes in one species (red, increase; blue, decrease; black, no change). The 
species (extant and ancestral) are ordered according to the order of the nodes in the 
species tree (top). The bottom row shows the coherence score for each column (purple, 
coherent), as evaluated by comparing the number of events to the distribution of events 
in the specific node within a random set of orthogroups of the same size. The overall 
significance of the coherence is reported in a P-value. (B-E) Phylogenetic coherence of 
the protein biosynthesis, Mitosis, 20S proteasome, and cell wall organization and 
biogenesis orthogroup sets. Copy-number variation coherence is presented as in (A). 
The copy-number changes observed in the protein biosynthesis (B), Mitosis (C), and 
20S proteasome (D) orthogroup classes are coherent. Those in the cell wall organiza- 
tion and biogenesis orthogroup set are not coherent (E). Orthogroup classes are 
projected from the GO gene classes. 
Figure 18.12 Coherent evolution of interacting proteins. Distance of genes in biochemical 
(A) and genetic (B) interaction networks (x-axis) is plotted versus the average 
difference between the ECVPs of pairs of genes of that distance or less in each network 
(y-axis). Pairs of paralogous genes are excluded from the computation of the averages. 
Black lines show the 1st, 5th, and 50th percentiles of the distribution of average distances from 
repetitions of this computation in networks with the same topology obtained by random 
reshuffling of the gene to profile associations. In both cases, similarity in ECVPs 
inversely scales with distance in the interaction network. Each network combines 
literature-curated results and high-throughput measurements. Similar results are 
obtained when only one source of data is used (data not shown). 



significantly similar patterns of gene gain and loss (Fig. 18.12). The similarity 
decreases as the distance in interaction networks increases. 

These findings extend the well-established observation that functionally 
related or interacting genes tend to show correlated co-occurrence (and 
coabsence) across species. However, unlike conventional phylogenetic 
profiles, the ECVP approach examines the full phylogenetic history of 
each gene. It therefore allows us to recognize that the persistent orthogroups 
related to tRNA splicing have coherent patterns of duplication and loss while 
the persistent ones related to cell wall organization and biogenesis do not 
(both have an identical conventional phylogenetic profile). 

4.4.3. Copy-number volatility 

Orthogroups show wide variation with respect to the number of duplica- 
tion and loss events: many are uniform with no such events (e.g., 
Fig. 18.9A), while many others are highly dynamic with many events 
(e.g., Fig. 18.9B). To benchmark the volatility in gene copy number for 
each orthogroup, we used the estimated rates of duplications and losses 
along each branch of the species tree to calculate the log-probability of the 
observed number of such events in each orthogroup, assuming that they 
occur independently according to a standard Poisson distribution. This 
statistic is used as a measure of volatility for each orthogroup. We compare 
this volatility metric to those of 10,000 hypothetical orthogroups with 
randomly generated duplications and losses (based on the empirical rates). 
We label orthogroups whose volatility deviates more than three standard 
deviations from the mean of the random distribution as being significantly 
"volatile." We find that the distribution of these events is inconsistent with 
a uniform rate of duplication and loss across orthogroups, as can be seen by 
comparing log-likelihood ratios for the observed occurrences with those 
expected by chance (Fig. 18.13). 
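
The volatility score described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that each branch is summarized by a single expected event count (rate times branch length), that volatility is taken as the negative log-probability so that event-rich orthogroups score high, and that all function names and parameter values are hypothetical.

```python
import math
import random


def poisson_log_pmf(k, lam):
    """Log-probability of observing k events under a Poisson with mean lam."""
    if lam == 0:
        return 0.0 if k == 0 else float("-inf")
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def log_probability(events, expected):
    """Sum of per-branch Poisson log-probabilities for one orthogroup.

    events[i]   : observed duplications + losses on branch i
    expected[i] : expected number of events on branch i (assumed input)
    """
    return sum(poisson_log_pmf(k, lam) for k, lam in zip(events, expected))


def sample_poisson(lam, rng):
    """Draw one Poisson variate (Knuth's method; adequate for small lam)."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def volatility_z_score(events, expected, n_random=10_000, seed=0):
    """Standard deviations separating the observed score from random orthogroups."""
    rng = random.Random(seed)
    observed = -log_probability(events, expected)
    simulated = []
    for _ in range(n_random):
        draw = [sample_poisson(lam, rng) for lam in expected]
        simulated.append(-log_probability(draw, expected))
    mean = sum(simulated) / n_random
    var = sum((x - mean) ** 2 for x in simulated) / (n_random - 1)
    return (observed - mean) / math.sqrt(var)
```

Under these assumptions, an orthogroup would be labeled volatile when `volatility_z_score` exceeds 3, and an orthogroup with no events on any branch receives the lowest possible score.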

We analyzed the 1018 uniform and 313 volatile orthogroups at the 
opposite extremes of this copy-number volatility scale, and found that 
they show diametrically opposed patterns with respect to a wide range of 
biological properties defined in S. cerevisiae (Fig. 18.13). For example, 
volatile orthogroups are enriched (P < 10~ ) for functional (GO) 
(Ashburner et al., 2000) categories reflecting peripheral transporters, receptors, 
and cell wall proteins and genes that participate in stress responses. By 
contrast, uniform orthogroups are enriched (P < 10~ ) in genes involved 
in essential growth processes and in genes residing in the nucleus, nucleolus, 
mitochondrion, endoplasmic reticulum, and Golgi apparatus. This evolutionary 
dichotomy is also aligned with the transcriptional program of S. 
cerevisiae, as reflected by regulatory modules (Section 4.3). Growth and cell 
cycle modules are overwhelmingly enriched for uniform and persistent 
orthogroups (e.g., modules for ribosome biogenesis, ER protein modifica- 
tion and targeting, morphogenesis). By contrast, development, stress, and 
carbohydrate metabolism modules are strongly enriched for volatile 
orthogroups (e.g., redox-detox, mating and filamentous growth, reserve 
carbohydrates, trehalose). There are a few notable exceptions to this general 
trend, including the proteasome and mitochondrial ribosome modules, which 
have a stress-like expression pattern but are enriched for uniform 
orthogroups. The uniform orthogroups are enriched in S. cerevisiae genes 

Figure 18.13 A functional dichotomy of uniform, persistent, and volatile 
orthogroups. (A) Orthogroup volatility. The histogram shows the number of 
orthogroups in each bin of volatility scores. The expected distribution when sampling 
random orthogroups from the evolutionary model is shown as a black line. Uniform 
orthogroups (leftmost blue column) are the lowest scoring orthogroups. Persistent 

that are essential for viability in YPD (405/1047 essential genes are in 
uniform orthogroups, P < 2.59 X 10 ), suggesting that not only are 
such genes rarely lost in other species (as would be expected), but that 
they are rarely duplicated. This may indicate a limit on the ability to 
maintain duplicate copies of essential genes or a latent functional redun- 
dancy between duplicate genes, making yeast genes with paralogs less 
essential overall. The former hypothesis is consistent with the observation 
that volatile orthogroups are enriched in genes whose expression changes 
significantly in response to many single-gene knockouts (Hughes et al., 
2000), genes with noisy levels of protein abundance within S. cerevisiae 
(Newman et al., 2006), and genes with variable RNA expression across 
species (Tirosh et al., 2006). On the other hand, the uniform orthogroups 
show complementary attributes and are enriched in genes whose expression 
is largely unchanged in the same systems. 

These results highlight a general bipolar principle in which certain types 
of genes readily undergo duplication and loss, while others rarely tolerate 
such events. Copy-number variation in stress-responsive genes may be not only 
tolerated but beneficial, allowing fungal cells to adapt to changing ecological 
niches. On the other hand, genes essential for cell growth, including but 
not limited to those necessary for the operation of intricate multiprotein 
complexes (Papp et al., 2003), cannot readily tolerate such noise and tend 
not to evolve by gradual duplication and loss. The correspondence we 
observe extends to orthogroups with moderately low- and high-volatility 
scores (and not only to extreme ones); these groups show similar patterns to 
the uniform and volatile orthogroups, suggesting a general principle 
imposed on the vast majority of genes in the genome. Overall, the evolu- 
tionary dichotomy in the rate of duplication and loss aligns closely with a 
clear bipolarity in gene function, transcriptional program, and expression 
noise across cells, strains, and species (Newman et al., 2006; 
Tirosh et al., 2006). These features likely reflect similar functional constraints 
on the amount of gene products available in the cell under different 
conditions. 



orthogroups usually receive low scores as well (blue columns). All orthogroups with 
a score above three standard deviations from the expected mean are volatile (red 
columns). (B) Gene class annotations enriched in uniform, persistent and volatile 
orthogroups. Annotations that were significant (at most P < 10^-3 after FDR correction, 
purple significance) among either uniform, persistent or volatile orthogroups are 
shown (the size of each annotation is shown on the left). The functional and mechanistic 
dichotomy between volatile and nonvolatile orthogroups largely reproduces along 
the full range of volatility scores (columns: bins of orthogroups with similar volatility 
scores; rows: significant annotations; yellow/blue: higher/lower relative enrichment 
compared to the expected enrichment in the class). 


5. Analysis of Paralogous Genes 

There are two basic processes by which gene duplications can give rise to 
functional innovation (Conant and Wolfe, 2006): neofunctionalization, in which 
one copy of the gene acquires a new functional role, and subfunctionalization, 
in which a gene's functional roles are divided between paralogs. A catalog of 
evolutionary histories and functional annotations provides an outstanding 
resource with which to study the degree to which each of these processes 
has occurred. We consider two measures of functional divergence between 
paralogs: (1) the degree of similarity in functional annotation and (2) the degree 
of similarity in (genetic or physical) interactions. 



5.1. Estimating functional divergence between 
paralogous genes 

To estimate the functional divergence between a pair of paralogous genes, 
we consider the gene set annotations assigned to each paralog. Ordinarily, 
we assume that paralogs have conserved their functions if they are both 
contained within the same functional annotation. Conversely, we estimate 
that paralogs have diverged with respect to a given function if one of them is 
not annotated in the same gene set as the other. For each gene class, we 
calculated the fraction of paralogous pairs that are retained within the class. 
To avoid confounding factors, we studied only cases in which both paralogs 
had been annotated and in which the annotation had not been inferred 
solely from sequence similarity (i.e., computationally). In cases where the 
gene sets are organized in an annotation hierarchy (e.g., GO), we regard a 
pair of genes as functionally diverged only if both genes are assigned to at 
least one annotation class and they are not both assigned to the most specific 
annotations of either of the two genes. 
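
The per-class retention calculation can be sketched as follows. This is a minimal sketch for flat (non-hierarchical) gene sets only; the function name, the input layout, and the rule that a pair is counted for a class when at least one member belongs to it are illustrative assumptions rather than the procedure actually used.

```python
def retention_by_class(paralog_pairs, gene_classes):
    """Fraction of paralogous pairs retained within each gene class.

    paralog_pairs : list of (gene_a, gene_b) tuples
    gene_classes  : dict mapping class name -> set of annotated genes
    A pair is considered for a class only if both genes carry at least
    one annotation and at least one of them belongs to that class.
    """
    annotated = set().union(*gene_classes.values()) if gene_classes else set()
    result = {}
    for name, members in gene_classes.items():
        retained = considered = 0
        for a, b in paralog_pairs:
            if a not in annotated or b not in annotated:
                continue  # skip pairs with an unannotated member
            if a in members or b in members:
                considered += 1
                if a in members and b in members:
                    retained += 1  # both paralogs kept the annotation
        if considered:
            result[name] = retained / considered
    return result
```

For example, with pairs `[("A1", "A2"), ("B1", "B2")]` and a class containing `A1`, `A2`, and `B1`, one of the two pairs is retained, giving a retention fraction of 0.5 for that class.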

When applying this approach to S. cerevisiae gene sets, we find that 
paralogous pairs rarely migrate between functional GO categories. The 
vast majority of paralogous pairs have not migrated at all on the three GO 
hierarchies. The retention rate is highest for GO molecular function categories 
(92%) and somewhat lower for biological process and cellular component 
categories (85% and 81%, respectively). 

By contrast, paralogous pairs frequently migrate with respect to gene 
classes defined by shared regulatory mechanisms (such as genes that are 
targets of a transcription factor, or contain the same cis-regulatory motif 
(Harbison et al., 2004) or RNA-binding motif (Gerber et al., 2004)). 
Indeed, in the majority of cases (70%) regulatory gene classes contain no 
retained paralogy relations within them, reflecting either novel regulation 
or regulatory specialization. The transcriptional modules exhibit an 
intermediate behavior, with 26% of the paralogous gene pairs having 
migrated between modules; a quarter of these migrations occurred 
across the two main growth and stress groups. 

Our analysis shows that paralogs diversify most frequently at the level of 
regulation, less frequently through changes in cellular component and 
biological process and very rarely at the level of biochemical function. 
This highlights inherent limitations of gene duplication in accomplishing 
molecular innovation. It also emphasizes the overwhelming importance of 
regulatory divergence in reconfiguring molecular systems following duplication 
(Carroll, 2008; Gu et al., 2004; Makova and Li, 2003). Changes in 
transcriptional regulation are clearly the foremost force that drives func- 
tional divergence after duplication. 



5.2. Estimating divergence based on degree of 
conserved interactions 

While functional categories provide extensive information, they are also 
crude, and hence cannot capture subtle divergence of two paralogs within 
the same general process. Functional characterization based on interaction 
networks addresses this limitation. In particular, annotation based on 
genetic interaction reflects the degree of shared biological process, whereas 
annotation based on biochemical (physical) interactions reflects the degree 
of shared molecular function. 

We employed two statistics to compute the degree of conserved inter- 
actions between pairs of paralogous proteins. The first was simply the 
fraction of shared interactions between both proteins. For this we counted 
the number of interactions each protein takes part in ($a_1$ and $a_2$ for proteins 
1 and 2, respectively) and the number of interactions they both share ($s$). 
The fraction of shared interactions is thus 

$$ f = \frac{s}{\min(a_1, a_2)} \qquad (18.4) $$

We also used the subfunctionalization index ($I_{sf}$), as previously described 
(He and Zhang, 2005), to characterize how far the interactions of a pair of 
paralogs have diverged. This is calculated as 

$$ I_{sf} = 1 - \frac{s + |a_1 - a_2|}{t} \qquad (18.5) $$

where $s$ is as above and $t$ is equal to $(a_1 + a_2 - s)$. This statistic gives a 
reasonable estimate of the degree of subfunctionalization in the absence of 
neofunctionalization, since subfunctionalization would reduce the number 
of shared interactions. This measure considers the proportion of ancestral 
interactions that are no longer shared between the paralogs and the extent of 
subfunctionalization for these interactions. Other methods for quantifying 
the degree of conservation between paralogous proteins have also been 
proposed, and we refer the reader to these publications for further reading 
(e.g., Musso et al., 2007). 
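
Both statistics are simple to compute from the sets of interaction partners of each paralog. The sketch below assumes the shared fraction is s / min(a1, a2) and the subfunctionalization index is 1 - (s + |a1 - a2|) / t with t = a1 + a2 - s; the absolute value in the index is an assumed reading that should be checked against He and Zhang (2005). The function names are hypothetical.

```python
def shared_fraction(partners_1, partners_2):
    """Shared interactions divided by the smaller of the two partner counts."""
    s = len(partners_1 & partners_2)
    smaller = min(len(partners_1), len(partners_2))
    return s / smaller if smaller else 0.0


def subfunctionalization_index(partners_1, partners_2):
    """Assumed form: 1 - (s + |a1 - a2|) / t, with t = a1 + a2 - s."""
    a1, a2 = len(partners_1), len(partners_2)
    s = len(partners_1 & partners_2)
    t = a1 + a2 - s  # size of the union of both partner sets
    if t == 0:
        return 0.0  # no interactions recorded for either paralog
    return 1 - (s + abs(a1 - a2)) / t
```

With this form, two paralogs with identical partner sets score 0, and two paralogs with equally sized but disjoint partner sets score 1.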

We find that the paralogous pairs can be partitioned into two distinct 
categories of roughly equal size. The pairs in the first category share a high 
and significant proportion of their interacting partners (136/318 pairs in the 
genetic network and 225/543 in the biochemical network). This is much 
higher than the proportion observed in comparable random networks, as 
estimated by comparing the degree of conserved interactions between the 
two paralogs in the real network to that in a degree-preserving randomized 
network (obtained by swapping edges between random pairs of nodes 10 
times). We repeated this procedure 10,000 times, and assigned an empirical 
P-value to the shared protein interaction neighborhood of a pair of 
proteins according to the number of times the fraction of shared interacting 
partners between paralogs in the randomized network is equal to or greater 
than the fraction in the real network. Paralogs that exhibit such a high overlap between their 
interacting partners thus show little migration from their initial configuration 
immediately following their duplication. 
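
The randomization test can be sketched as follows. This is an illustrative sketch, not the authors' code: it assumes an undirected network given as a list of edges, uses double-edge swaps to preserve node degrees, and treats the number of swaps per edge as a free parameter. All names are hypothetical.

```python
import random


def swap_edges(edges, n_swaps, rng):
    """Degree-preserving randomization by double-edge swaps."""
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:
            c, d = d, c  # pick one of the two possible rewirings
        if len({a, b, c, d}) < 4:
            continue  # would create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue  # would create a duplicate edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges


def neighbors(edges):
    """Adjacency sets for an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def shared_partner_fraction(adj, g1, g2):
    """Shared partners over the smaller partner count, ignoring the pair itself."""
    n1 = adj.get(g1, set()) - {g2}
    n2 = adj.get(g2, set()) - {g1}
    smaller = min(len(n1), len(n2))
    return len(n1 & n2) / smaller if smaller else 0.0


def empirical_p_value(edges, g1, g2, n_random=10_000, swaps_per_edge=10, seed=0):
    """Fraction of randomized networks with at least the observed overlap."""
    rng = random.Random(seed)
    observed = shared_partner_fraction(neighbors(edges), g1, g2)
    hits = 0
    for _ in range(n_random):
        shuffled = swap_edges(edges, swaps_per_edge * len(edges), rng)
        if shared_partner_fraction(neighbors(shuffled), g1, g2) >= observed:
            hits += 1
    return hits / n_random
```

Each randomized network starts from the real network, so every node keeps its original number of interactions while its partners are reshuffled.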

By contrast, the pairs in the second category share no interacting partners 
whatsoever, even though they themselves may frequently interact. Inter- 
mediate examples are rare (< 9% of the total). Notably, the second category 
includes many pairs of paralogous genes that belong to the same functional 
classes. In the biochemical network, these disjoint paralog pairs tend to be 
highly dispersed in the network (often at distance of four or greater) 
implying that they act in distinct physical environments. In the genetic 
network, however, these paralogs are often neighbors. This suggests that 
they have some overlapping function or can compensate for each other, 
despite the apparent divergence in roles. This pattern of diversification 
suggests a partial division of labor involving specialization (rather than 
adoption of a new function) of two paralogous proteins that become 
physically or temporally separated but can still have adverse consequences 
when both are compromised (Kafri et al., 2005). 




6. Discussion and General Applicability 

In this chapter, we presented a framework for identifying groups of 
orthologous genes in a step-wise manner, while simultaneously reconstruct- 
ing a corresponding phylogenetic gene tree for each group. We describe a 
novel algorithm, Synergy, that uses this framework to reconstruct a 
genome-wide catalog of gene trees across species by incorporating multiple 
sources of information, including sequence similarity and conserved gene 
order when relevant. Synergy's gene trees reflect the evolutionary history of 
each group of genes, allowing us to accurately identify orthology and 
paralogy relations between genes, and the duplication, loss and divergence 
events that underlie these relations. 

This approach has several important benefits. Its accurate automatic 
genome-wide resolution is unprecedented; it is typically absent from 
automatic "hit-clustering" methods applied on a genomic scale, which 
either ignore paralogs altogether (RBH) or do not make detailed distinctions 
between orthologs and paralogs (Tatusov et al., 1997). Since Synergy's 
gene tree reconstruction is constrained a priori by the topology of the species 
tree, we do not have to apply extra reconciliation steps (e.g., Durand et al., 
2006). For example, in orthogroups that have no duplication and loss 
events, our algorithm is guaranteed to yield the correct gene tree. 

Synergy strikes an important balance between orthogroup completeness 
(sensitivity) and soundness (specificity). We ensure completeness by allow- 
ing many edges (candidate homology relations) into the input gene similar- 
ity graph and by applying a lenient criterion to derive candidate 
orthogroups. Then, we achieve soundness by refining these coarse relations 
as we progress through the species tree, breaking orthogroups using phylo- 
genetic principles at each Stage. The bottom-up design of our algorithm 
also renders it scalable, allowing us to handle a large number of species and 
genes. 

While our bottom-up approach provides high-quality results, it is nev- 
ertheless a greedy algorithm and can occasionally mis-assign genes. This 
greediness could be relaxed by adding top-down re-assignments after the 
bottom-up reconstruction is completed. Formulating the orthology 
resolution problem within the framework of bottom-up orthogroup iden- 
tification should provide an important paradigm for additional algorithmic 
solutions. In addition, reconstructing the orthogroups' ancestral sequences 
at each Stage of the algorithm may allow it to deduce valuable information 
that can also be included in identifying orthologs more accurately. For 
instance, estimating the gene-specific mutation rates might also be consid- 
ered when determining orthology assignments. Not doing so might 
currently be hindering Synergy in cases where genes are particularly fast- 
or slow-evolving. These top-down and ancestral sequence reconstruction 
steps are both strategies that may yield improved results as well as intriguing 
insights about gene family evolution. 

Synergy opens the way to a host of comparative genomics studies. 
As comparative genomics gains widespread popularity, scores of groups of 
species have been extensively sequenced, all of which can be tackled with 
our scalable algorithm. We describe several methods for incorporating 
functional genomics data, including gene set annotations and interaction 
networks, to study the effects of and constraints on gene duplication and loss 
on molecular networks. This approach can be broadly applied to study the 
evolutionary patterns within any group of organisms whose genomes have 
been sequenced, allowing us to study how gene histories and functions can 
be closely interrelated. 

Abstract 

Experimental evolution refers to a broad range of studies in which selection 
pressures are applied to populations. In some applications, particular traits are 
desired, while in others the subject of study is the mechanisms of evolution or 
the different modes of behavior between systems. This chapter will explore the 
range of studies falling under the experimental evolution umbrella, and their 
relative merits for different types of applications. Practical aspects of experimental 
evolution will also be discussed, including commercial suppliers, analysis meth- 
ods, and best laboratory practices. 




1. Introduction 

Experimental evolution is a generic term for laboratory selection 
experiments beyond those requiring simple one-step mutagenesis but per- 
haps more restricted in scale than the longer term pressures associated with 
domestication or geological timescale evolution. As our ability to analyze 
whole genome sequences improves via microarray and sequencing-based 
methods, we can expect more problems to become accessible through 
experimental selection approaches. 

In this chapter, I will cover the different types of selections typically 
performed under the guise of experimental evolution, citing a limited 
selection of example cases, and then move to the practical considerations 
involved in undertaking a subset of these. The many scientific contributions 
of experimental evolution in viral, microbial, and animal systems will not be 
covered in this chapter. However, the reader will find many excellent 
reviews that cover these systems (Adams, 2004; Buckling et al., 2009; 
Burke and Rose, 2009; Elena and Lenski, 2003; Garland and Kelly, 2006; 
Philippe et al., 2007; Zeyl, 2006). 




2. Experiment Rationale 

Rationales for experimental evolution approaches in yeast are as 
numerous as the practitioners. Many research groups see the promise in 
explicitly testing many of the tenets of modern evolutionary theory. Experiments 
with sex and ploidy are exemplars of this approach, and have been 
reviewed thoroughly elsewhere (Zeyl, 2004). Other evolutionary questions 
subjected to experimental testing include the role of mutation rates (e.g., 
Thompson et al., 2006), mechanisms of assortative mating (Leu and Murray, 
2006), cooperation (Shou et al., 2007), and clonal interference (Kao and 
Sherlock, 2008). 

In their simplest guise, such as selection experiments on drug resistance, 
laboratory populations can mimic the types of long-term adaptation that 
occur in chronic infections or cancer progression. Because the selection 
pressures can be carefully controlled under laboratory conditions, however, 
mutations can be more confidently assigned to a variable than by examining 
clinical samples. Acquisition of fluconazole resistance is a fine example from 
the Candida literature (e.g., Cowen et al., 2000). These experiments serve 
both as a model for the development of drug resistance and as a means of 
unraveling the molecular mechanisms underlying resistance to particular drugs. 

Other types of experiments essentially extend this concept of the mutant 
or suppressor screen. With longer term, less severe selection pressures than 
with viability-based selection schemes, more subtle mutations, including 
combinations of such mutations, may be recovered. Although the yeast 
genome deletion collection provides an interesting set of mutants for the 
assay of phenotypes, the strains represent only null alleles. To make progress 
in further dissecting genetic pathways, the field may benefit from a return to 
the rare but interesting alleles generated by spontaneous mutation, particu- 
larly for essential genes. 

This style of experimentation can also shed light on larger questions of 
systems biology, such as evolution of gene expression, the relative merits of 
regulatory versus structural mutation, whether mutations affecting control 
points in a network are more wide-acting, and the mutability of different 
gene targets. Metabolic selection pressures have been particularly useful for 
studies along these lines (e.g., Ferea et al., 1999; Francis and Hansche, 1972; 
Gresham et al., 2008; Hansche et al., 1978). 

Interesting questions about genome structure and organization can also 
be answered using the results of experimental evolutions. For example, 
point mutations, transposon insertions, and copy number variants have all 
been recovered from selection experiments (e.g., Blanc and Adams, 2003; 
Brown et al., 1998; Dunham et al., 2002; Gresham et al., 2008). The types of 
effects generated by these different classes of genetic alterations, and the 
relative accessibility of different gene targets to each type of lesion, are still 
unexplored but accessible by these techniques. For example, copy number 
changes may be the most effective route by which to change gene expres- 
sion level, but the ability of a gene to change copy number may in turn be 
determined by the proximity to repetitive DNA segments that facilitate 
copy number change, and further complicated by the pleiotropic effects of 
additional neighboring genes on an amplicon. Point mutation, on the other 
hand, represents a more surgical approach to perturbing gene function, but 
single point mutations may rarely provide large expression changes. 

Finally, experimental evolution can provide a facile technique to opti- 
mize or even create desired traits, for example in bioproducts, food, and 
beverage production (reviewed in Verstrepen et al., 2006). Industrial yeast 
geneticists have long used this strategy successfully, often with strains of 
unknown genotype. Although recombinant DNA techniques in food- 
related industries such as wine and beer production are becoming more 
widespread (reviewed in Schuller and Casal, 2005), there is consumer 
reluctance to use such products. In such cases selection is frequently the 
only acceptable tool for improving extant yeast strains. 

For more industrial processes, introduction of recombinant DNA into a 
strain is less of a concern, but there is to date no clear recipe for the rational 
design of metabolic networks. Selection without prior knowledge of the 
mechanism can instead improve the performance of yeast strains involved in 
processes such as production of fuel ethanol and biotechnology products. 
Transfer of exogenous synthesis pathways into yeast, for example, could be 
followed by selection for more efficient integration into the yeast network, 
or higher product production. 

Applications to synthetic biology (one such approach is reviewed in 
Saito and Inoue, 2007) and synthetic ecology (e.g., Shou et al., 2007) will 
also put experimental evolution techniques at the forefront. Allowing the 
processes of mutation and selection to tune synthetic constructs may be the 
most efficient way both to create such circuits and to better understand what 
is required to achieve optimal performance. Clever selection schemes will 
no doubt be necessary to push these systems in the right direction. 




3. Experimental Evolution Approaches 

Laboratory evolution experiments fall into two broad categories: serial 
batch transfer and continuous culture. The most suitable approach depends on 
technical, practical, and scientific considerations, covered in the following 
sections. 

3.1. Serial dilution 

Serial dilution generally refers to selection performed in the standard growth 
regimes typically used in the lab: flasks, test tubes, solid media, or 96-well 
plates. Cultures are usually allowed to grow through a normal growth 
curve, with daily transfer of a small volume of the expanded culture into 
fresh medium. Serial dilution has many advantages: the materials necessary 
are typically already present in the lab and require no special engineering. 
Conditions can be adjusted as the experiment progresses (e.g., drug concentrations 
increased as drug resistance improves). Selection pressures of a 
number of types can be accommodated. The easiest selections to understand 
are improvements to growth when maximal performance is attenuated 
either by exogenous or genetic means. In these cases, full growth curves 
may not be desired, as improved performance with respect to nutrient 
exhaustion or stationary phase may be separate outcomes unrelated to the 
main selection applied by the experimenter. 
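
When planning a transfer schedule, it helps to know how many generations elapse per transfer and how many cells pass through each bottleneck. The sketch below relies on one assumption: the culture regrows to the same density before every transfer, so each cycle spans log2 of the dilution factor in generations. The function names and example numbers are illustrative.

```python
import math


def generations_per_transfer(dilution_factor):
    """Doublings needed to regrow after a dilution (e.g., 1:1024 -> 10)."""
    return math.log2(dilution_factor)


def bottleneck_size(cells_per_ml, culture_volume_ml, dilution_factor):
    """Number of cells carried over into fresh medium at each transfer."""
    return cells_per_ml * culture_volume_ml / dilution_factor


# Example: a 10 ml culture at 1e8 cells/ml, diluted 1:100 every day
daily_generations = generations_per_transfer(100)   # about 6.6 generations
founders = bottleneck_size(1e8, 10, 100)            # 1e7 cells per transfer
```

A larger dilution factor yields more generations per transfer but a smaller bottleneck, which can change how evolution proceeds.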

Nutrient exhaustion is a popular selection scheme for batch transfer 
experiments, brought to prominence by experiments in bacteria by Lenski 
and colleagues (reviewed in Elena and Lenski, 2003) and adapted for yeast 
by Zeyl (2005). Here, one nutrient is lowered to the point that it uniquely 
runs out first and limits the saturated biomass of the culture. The relative 
amount of time cells spend in each phase of growth may change over the 
course of one of these experiments, particularly as lag phase shortens and 
maximal growth rate improves. 

Plate-based selection allows even more control over the transfer step, 
with visual identification of colonies. Either obviously larger or otherwise 
morphologically desired (e.g., Kuthan et al., 2003) candidates can be serially 
inoculated to fresh plates, or, on the other end of the spectrum, as little 
selection as possible can be imposed by selecting random colonies. The 
latter approach has been used to generate mutation accumulation lines (Zeyl 
and DeVisser, 2001). 

3.2. Chemostats 

Chemostats have long been another favored platform for experimental 
evolution (reviewed in Dykhuizen and Hartl, 1983), and were, in fact, 
invented for this application (Monod, 1950; Novick and Szilard, 1950a,b). 
A chemostat is a growth vessel into which fresh medium is delivered at a 
constant rate and cells and spent medium overflow at that same rate. Thus, 
the culture is forced to divide to keep up with the dilution, and the system 
exists in a steady state where inputs match outputs. The chemostat is 
attractive due to the enormous amount of control that is possible: growth 
rate, cell density, and selection pressure are all independently set. Because of 
these advantages, chemostats are also being used as tools for studying aspects 
of cell biology such as ammonium toxicity (Hess et al., 2006), growth rate 
control (e.g., Brauer et al., 2008), and comparative gene expression for 
mutants that would otherwise be difficult to compare due to profound 
growth rate differences (e.g., Hayes et al., 2002; Torres et al., 2007). 
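
At steady state, the specific growth rate of a chemostat culture equals the dilution rate, that is, the medium flow rate divided by the culture volume. The following sketch turns pump and vessel settings into a growth rate and doubling time; the function names and the example values are illustrative assumptions, not recommended settings.

```python
import math


def dilution_rate(flow_ml_per_h, volume_ml):
    """Dilution rate D (per hour) = flow rate / working volume."""
    return flow_ml_per_h / volume_ml


def doubling_time_h(dilution_rate_per_h):
    """At steady state, growth rate equals D, so doubling time is ln(2) / D."""
    return math.log(2) / dilution_rate_per_h


# Example: 34 ml/h of fresh medium into a 200 ml culture
d = dilution_rate(34, 200)   # 0.17 per hour
td = doubling_time_h(d)      # roughly 4.1 hours per doubling
```

Setting the pump rate therefore sets the growth rate directly, provided the dilution rate stays below the maximal growth rate of the strain in that medium.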

Unlike serial transfer, chemostats require more specialized equipment, 
which can range from rather inexpensive (<USD$10,000) systems assem- 
bled from available parts to elaborate custom fermenters costing upward of 
USD$100,000. A table of suppliers and plans is provided (Table 19.1). 

Table 19.1 Fermenter suppliers 

Maker                                      Web site or citation 

Commercial 
ATR                                        http://www.atrbiotech.com/ 
Infors HT                                  http://www.infors-ht.com/ 
New Brunswick                              http://www.nbsc.com/ 
Dasgip                                     http://www.dasgip.com/ 
Applikon                                   http://www.applikon-bio.com/ 
Sartorius                                  http://www.sartorius-stedim.com 

Homemade or custom-made 
Tom Gibson and Ted Cox                     Plans available in Gibson 1970 thesis, 
                                           "The Fitness of an E. coli Mutator 
                                           Gene," available through interlibrary 
                                           loan from Princeton University 
Reeves Glass (custom-made glassware)       http://www.reevesglass.com/ 
Gavin Sherlock                             Plans available upon inquiry 
G. Finkenbeiner (custom-made glassware)    http://finkenbeiner.com/ 





In choosing equipment, several experiment-specific questions are 
important to consider, including volume/population size, number of paral- 
lel experiments required, space constraints, and measurement and control 
needs. The simplest chemostat experiment requires a media feed, volume-metering 
device, and growth vessel with overflow. pH control, dissolved 
oxygen monitoring, real-time data feeds, and other features may be 
required for more complex experiments. Large-volume (> 1 L) fermenters, 
such as those available from New Brunswick, Applikon, and ATR, offer 
in-line probes for such measurements. Small vessels can be made from 
modified laboratory glassware or with the assistance of a glass blower. 
Glass-blown designs may include a glass frit for aeration and a water jacket 
for temperature control in addition to sampling and media flow ports. 



3.3. Turbidostats 

Another continuous culture system, the turbidostat, first introduced by 
Bryson and Szybalski (1952), combines some properties of serial dilution 
and chemostats. Instead of adding new medium at a constant rate, in a 
turbidostat, cell density is held constant. This is achieved by a feedback loop 
allowing adjustment of the nutrient addition rate in response to changes in 
density, usually measured via light transmittance. Few commercial options 
appear to exist currently, but turbidostats can be built using modern 
microprocessor-controlled peristaltic pumps. Designs using simple
light-measurement devices can be found in textbooks (e.g., Norris and Ribbons,
1970), and variations using LED and photodiode components would be 
straightforward extensions. 
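
The feedback loop described above can be sketched as a short simulation. This is a minimal illustration under stated assumptions, not a controller design taken from the cited sources: the growth rate, density setpoint, pump rate, and time step are all assumed values, and `simulate_turbidostat` is a hypothetical name. The pump runs only while the simulated density exceeds the setpoint, so density is held near the setpoint while the time-averaged dilution rate settles close to the maximal growth rate.

```python
def simulate_turbidostat(mu_max=0.4, setpoint=1.0, pump_rate=1.0,
                         x0=0.1, hours=48.0, dt=0.01):
    """Simulate simple on/off turbidostat control (all values are assumptions).

    mu_max    : maximal specific growth rate (per hour)
    setpoint  : target density (arbitrary optical density units)
    pump_rate : dilution rate while the pump is on (per hour); must exceed mu_max
    Returns (final density, mean dilution rate over the second half of the run).
    """
    x = x0
    steps = int(hours / dt)
    diluted_time = 0.0
    for i in range(steps):
        pump_on = x > setpoint            # feedback: dilute only when too dense
        d = pump_rate if pump_on else 0.0
        x += (mu_max - d) * x * dt        # growth minus washout (Euler step)
        if i >= steps // 2 and pump_on:
            diluted_time += dt
    mean_dilution = pump_rate * diluted_time / (hours / 2)
    return x, mean_dilution


density, mean_d = simulate_turbidostat()
print(f"final density {density:.3f}, mean dilution rate {mean_d:.3f} per hour")
```

In a physical device the density reading would come from a light-transmittance sensor and the pump decision would drive a peristaltic pump; both are replaced here by the simulated variable `x`.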

The turbidostat provides selection on maximal growth rate while simul- 
taneously maintaining other conditions constant. The media composition 
defines the selection pressure as in other systems. Very little has been 
published on yeast grown in turbidostats, although that is likely to change 
given the benefits that this system provides. 



3.4. More specialized systems 

Other continuous culture systems have also been invented to control cell 
growth in various ways, via feedback at the level of pH or dissolved oxygen 
(generally known as auxostats, or, depending on implementation,
accelerostats; see Kasemets et al., 2003), dielectric permittivity (the permittistat,
Mark et al., 1991), or carbon dioxide (e.g., Lane et al., 1999). Undoubtedly,
many other variations are possible. 



3.5. Miniaturization 

Both serial dilution and chemostat culture can be greatly miniaturized. For 
example, microfluidic chemostats have been reported by Groisman et al.
(2005). This can be a huge advantage when large numbers of replicates or 
single-cell resolution are required. Volume reduction can greatly affect the 
population size, though, which can in turn change how evolution proceeds 
(see the following section for further discussion). For now, microscale 
chemostats might best be used as a phenotyping tool to better characterize 
clones isolated from larger chemostat experiments. 




4. Experimental Design 

There are a number of design considerations in planning any experi- 
mental evolution project. Several of the most important ones are covered in 
the following sections. 

4.1. Growth conditions 

The growth rate and selection pressures at which experimental evolutions 
are performed should be given careful thought. Selection upon maximal 
growth rate is preferred for many suppressor-type experiments, and may 
best be accomplished in serial dilution or turbidostat approaches because 
chemostat cultures are difficult to operate near this value. To maintain a
constant selection pressure, dilution should occur before the onset of 
growth limitation. In the event that entire growth curves are allowed 
each day (e.g., Zeyl, 2005), subpopulations with adaptations relevant to 
the different parts of the growth curve can be isolated. For example, strains 
with quicker resumption in lag, with faster maximal growth, or with 
additional ability to divide in stationary phase may all coexist in the culture. 
In such cases, the nutrient that runs out first is frequently termed limiting, 
but may be only one of several selection pressures. 

In chemostat cultures, the selection pressure is mainly defined by the 
limiting nutrient. Limiting nutrient should always be explicitly tested in the 
chemostat conditions but can be prototyped in batch cultures by measuring 
the saturation density of cultures grown with varying amounts of the limiting 
nutrient. Confirmatory experiments should always be done in the exact 
conditions under which the real experiments will be performed. Complex 
selection pressures may not be perfectly modeled by batch experiments; for 
example, ammonium toxicity is only apparent under limiting potassium in the 
chemostat (Hess et al., 2006). Micronutrients are another common culprit as
hidden limitations (e.g., de Kock et al., 2000). A comprehensive test of limiting
nutrient would include demonstrating that density varies linearly with the 
nutrient of interest and not at all with other additives. The exact profile of 
limiting concentrations for various nutrients is strain-dependent and should be 
tested explicitly when working in different backgrounds. 
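
The linearity test described above can be sketched as a pair of least-squares fits. The data below are hypothetical values invented for illustration (they are not measurements from this chapter), and `linear_fit` is a hypothetical helper. The expectation is a strong linear dependence of density on the candidate limiting nutrient and a slope near zero for a control additive.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared


# Hypothetical example data: final culture density (arbitrary units) at several
# concentrations of the candidate limiting nutrient and of a control additive.
glucose_pct = [0.02, 0.04, 0.08, 0.16]
density_vs_glucose = [0.26, 0.49, 1.01, 1.98]
control_mg_per_l = [1.0, 2.0, 4.0, 8.0]
density_vs_control = [1.00, 1.02, 0.99, 1.01]

slope_g, _, r2_g = linear_fit(glucose_pct, density_vs_glucose)
slope_c, _, r2_c = linear_fit(control_mg_per_l, density_vs_control)
print(f"glucose: slope {slope_g:.2f}, r^2 {r2_g:.4f}")
print(f"control: slope {slope_c:.4f}, r^2 {r2_c:.4f}")
```

A fit of this kind only screens candidate conditions; as noted above, the confirmatory experiment still has to be run in the chemostat itself.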

The concentration of limiting nutrient is another key parameter since it 
determines cell density. Population size can greatly affect evolutionary 
parameters such as mutation supply, importance of drift, and time required 
for advantageous mutations to rise to detectable frequency (see further 
discussion in the following section). Also, density may affect gene expres- 
sion to some degree. Given recent findings on quorum sensing in yeast 
cultures, more work remains to be done to understand density-dependent 
effects. There are also practical considerations such as sample volume 
required for accurate measurements: dry weight yield, for example, may 
require more material than an expression microarray using an amplification 
procedure. For continuous cultures in particular, sampling too much vol- 
ume at once from the fermentor vessel can perturb the system, disrupting 
the steady state. When possible, passive sampling from the outflow is 
preferred, though this is not always possible, especially for time-sensitive 
applications such as expression measurements. 

4.2. Population size 

Beyond practical constraints, population size is a critical parameter for 
experimental evolution. In large populations, a modest adaptive mutation 
will take a long time to reach reasonable frequency in the population, and 
clonal interference could be generating competitor clones simultaneously.
In a small population, adaptation may be limited by the supply of beneficial 
mutations, and thus dominated by the highest frequency class. (See Desai 
et al., 2007 for one treatment of these issues.) Per base mutation rate is on
the order of 10~ or 10~ per site per generation, while per gene mutation 
rate to a null allele is closer to 10~ . Given these estimates, populations of 
different sizes will sample vastly different subsets of the mutation landscape. 
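
The dependence of mutation supply on population size can be illustrated with a short calculation. The per-site rate used here is an assumed order-of-magnitude value chosen only for illustration, the Poisson approximation is an added simplification, and `mutation_supply` is a hypothetical helper.

```python
import math


def mutation_supply(population_size, rate_per_generation):
    """Expected number of new mutants at one target per generation, and the
    probability that at least one arises (Poisson approximation)."""
    expected = population_size * rate_per_generation
    p_at_least_one = -math.expm1(-expected)   # 1 - exp(-expected)
    return expected, p_at_least_one


ASSUMED_PER_SITE_RATE = 1e-10   # assumption: order of magnitude only

for n in (1e6, 1e8, 1e10):
    expected, p = mutation_supply(n, ASSUMED_PER_SITE_RATE)
    print(f"N = {n:.0e}: expected mutants per generation {expected:.1e}, "
          f"P(at least one) = {p:.3g}")
```

Under these assumptions a specific point mutation is expected about once per generation only in the largest population, whereas the smallest population would wait thousands of generations on average for the same event.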

In serial dilution, two population size parameters must be determined: 
the saturation density and the bottleneck size. Severe bottlenecks may 
eliminate the vast majority of small-to-intermediate fitness variants that 
have not had time to reach appreciable frequency in a single day growth 
curve. In practice, the bottleneck population size is typically on the order of 
10^6-10^7 cells.

Chemostat populations may be much larger, 10 cells or more. Even in 
chemostat cultures, the initial phases may be dominated by variation gener- 
ated as the culture grows from a single cell to the final population size. In this 
regime, the stochastic factor of when a mutation occurs can affect the allele 
frequency. For example, mutations of ~ 10% fitness advantage that become 
detectable starting around 100 generations of growth were hypothesized to 
fit this model (Gresham et al., 2008).



4.3. Experiment duration 

The number of generations or length of time to allow a culture to evolve is 
both a practical and theoretical matter. In some cases, a particular desired 
outcome may be reached. In others, the end of an experiment may be 
governed by unfortunate circumstances such as contamination, user error, 
or infrastructure breakdown. Steps can be taken to prevent some of these 
events, such as careful attention to sterile practices. Backup power and 
aeration systems can also be implemented if necessary. Clumping of the 
culture is another less catastrophic, though perhaps less avoidable, endpoint 
(see below). Practical considerations concerning the number of manipula- 
tions and the necessity for lab worker attention past work hours may also 
limit experiment length. 

Barring errors, however, experiments can run for days to decades. 
Whether experiments really require such long timescales is purely a scien- 
tific question. Early events in the chemostat may determine to a large extent 
what direction the population takes. Subsequent events may in fact be 
modifying mutations that optimize early events as opposed to "primary" 
events that may be more interesting. For example, where genome rear- 
rangement is operative, amplicon size may shrink to contain relatively more 
causative genes and fewer copy number sensitive genes, or second-site 
suppressors of these sensitivities may arise. 




Restarting interesting evolutions is also always an option, though there 
are likely added complications from loss of population complexity, plus 
added selection constraints on freeze tolerance and outgrowth from the 
stock. Evolved clones could alternatively be used as new founders. 

5. Practical Considerations

5.1. Strains and markers 

The strain used for experimental evolutions should be considered very 
carefully. Auxotrophies should be avoided for any metabolic selection, 
since selection would be strong for harnessing supplements as nutrient 
sources. Also, the network biology may be rather different in metabolically 
blocked strains. Marker genes can also cause fitness differences. Baganz et al.
(1997) explicitly tested the fitness consequences of drug and nutritional 
markers in chemostat competition and found variation by selection pressure 
and marker used. In general, drug resistance markers were neutral while 
nutritional markers frequently caused fitness costs. These results are likely to 
depend strongly on the particulars of the growth regime and should be 
explicitly tested in novel environments. 

Strain genotype beyond engineered genetic markers should also be 
considered. The classic S288C strain background, though a workhorse for 
decades of yeast genetics and biochemistry, has a number of probably lab- 
selected traits, including Ty insertions in HAP1 and CTR3, increased petite 
frequency, and abnormal nitrogen source preferences. Almost all the lab 
strain alternatives (e.g., W303, sigma 1278b, and CEN.PK) share a large 
proportion of their genomes with this strain, though the exact alleles carried 
vary. These other backgrounds may also carry additional mutations, such as 
an adenylate cyclase (CYR1) mutation in CEN.PK. All lab strains are likely 
to contain some signatures of their domestication. 

In addition, new mutations are generated spontaneously during lab 
cultivation and strain construction, and may result in hidden problems. 
One cautionary example can be found in a series of glucose-limited
chemostat evolutions (Ferea et al., 1999). During creation of the prototrophic
ancestor strain, a loss of function mutation occurred in the gene AEP3, 
which stabilizes the RNA coding for subunits of ATP synthetase in the 
mitochondria. Not surprisingly, this mutation was detrimental in glucose- 
limited cultures, and reversion of this mutation was later found in evolved 
strains from two independent experiments (Brauer et al., 2006; Dunham
et al., 2002; Gresham et al., 2006).

Although such problems cannot be completely avoided, they can be 
mitigated by pairing compatibility of a particular strain with a particular 
selection pressure. For example, as long as a particular mutation is not 
limiting, it may not be the target of beneficial mutation. van Dijken et al.
(2000) undertook a comparison between strain backgrounds and found 
variation for all parameters tested, leaving no single strain with the "best" 
array of desired characteristics. 

Flocculant growth is another strain feature that can present problems. 
In many experimental evolution regimes biofilm formation and clumping 
provide an advantage unrelated to the selection pressure of interest. For 
example, groups of cells may sink to the bottom of a fermentor, or biofilms 
may form on any surfaces, allowing subpopulations to avoid being diluted 
out of the culture. Besides complicating measurements of cell density, these 
subpopulations can contribute a constant supply of minority genotypes and 
interfere with population genetic measurements. Also, since many evolu- 
tion experiments are designed around evaluating particular selection pres- 
sures, generic responses to the growth apparatus can be a confounding 
result. 

For serial dilution-type experiments, some of this effect can be elimi- 
nated by transferring to a new vessel at each dilution rather than pouring off 
excess culture and adding new medium to the original vessel. In chemostats, 
transferring vessels is a riskier process, but can still be accomplished with 
care. Using strains with genotypes that limit their flocculation potential is 
another approach. Most lab strains carry at least one such mutation, and 
engineered FLO gene deletions (e.g., flo8) will not revert. Operation at high
cell densities may further aggravate this phenotype, though much of this 
data is anecdotal. In practice, severe flocculation typically ends an experi- 
ment. Experiments can be prolonged by briefly sonicating culture samples 
before analysis. 

Strains resulting from experimental evolution may also have a number of 
characters limiting their further use. Selection upon purely mitotic growth 
may relax selection on the rest of the yeast life cycle. Aneuploids, for 
example, may have trouble sporulating or segregate lethality resulting 
from the heterozygous deletion of essential genes. The mating pathway 
may also be abrogated as a means of conserving cellular resources (Lang 
et al., 2009). Since most samples are archived through cryopreservation,
freeze tolerance may be another hidden variable affecting later analysis. In 
addition, some cultures evolved in poor nutrient conditions may actually 
show growth deficiencies on rich media, though the reverse may also be 
true for the purposes of recovery from frozen stocks. 

5.2. Media 

Media requirements will depend on the desired selection pressure, but must 
be consistent no matter what the application. For this reason, rich media 
made from coarse or technical grade ingredients is not recommended due to 
batch variation. High-quality chemicals and water are required, especially 
for nutrient-limited cultures, since trace contamination may provide a
nontrivial amount of the nominal limiting nutrient. When possible, direct 
measurement of the limiting nutrient in media samples is recommended. 
Very sensitive spectrophotometric, enzyme-based assays are available com- 
mercially for many carbon sources. Phosphate can also be reliably measured 
by colorimetric assay. Other chemical analysis techniques such as induc- 
tively coupled plasma and mass spectrometry can more generally measure 
the elemental or metabolite profile of a sample, though sensitivity should 
generally be tested explicitly. 

Preparation of media for experimental evolution requires more care than 
most microbiology experiments. Even small measurement errors can have 
profound effects on culture density. All materials used for media preparation 
should be rinsed thoroughly to prevent cross contamination. Variation in 
volume levels due to evaporation may be introduced by unanticipated 
differences in autoclave pressure, temperature, or timing and these may 
dramatically affect the outcome of the experiment. Filter sterilization of 
media into autoclaved carboys is one way of mitigating this effect. Some 
media components may be light or temperature labile. Also, in large 
volumes, viscous additives such as glucose may settle to create a gradient 
in the media vessel. Extra effort may be required to ensure that all such 
additives are thoroughly dissolved. 

The growth apparatus itself may also leach chemicals into the medium.
Metal fittings are one example supported by anecdotal evidence. Low- 
reactivity plastics and glass are typically a better choice for experiments 
that would be sensitive to such fluctuations. Plastics may pose their own 
problems if paired with incompatible solvents. Drug solutions requiring 
such reagents require particular care. 



5.3. Growth rate 

Growth rate is explicitly set by the experimenter for chemostat cultures, 
but the allowable range is dependent on media, temperature, and strain 
background, and growth rate differences introduce differences in many 
parameters. One important example is the ratio of respiration to fermenta- 
tion in glucose-limited cultures grown at different rates. At growth rates 
below a strain-specific critical growth rate parameter, respiration dominates, 
but above this threshold, fermentation predominates. Evolutions performed 
very close to this boundary may shift thresholds as a mechanism of increas- 
ing efficiency. In chemostat and batch conditions, growth rate correlates 
strongly with a large gene expression pattern that overlaps that of the 
environmental stress response (Brauer et al., 2008). Incorrect dilution
settings can thus easily lead to spurious gene expression differences. A working
rule of thumb is to keep settings within 10% of the target growth rate. 

5.4. Good sterile practices

With some experiments running for years, contamination is a threat to 
experiment integrity. Contamination can be introduced during any break 
in continuity, most commonly during changing media supplies, or sampling 
from inside the vessel. Sampling should be done passively from the overflow 
if possible, though contaminants may also grow in exposed tubing. Periodic 
changes of this tubing may be required, particularly if drug-resistant con- 
taminants interfere with detection of low-frequency variants from the main 
culture. This design also helps to prevent retrograde colonization of the 
main culture with contaminants from the effluent. Positive pressure 
provided by vigorous aeration is also recommended to limit contamination 
opportunities. 

When changing media carboys, leakage should be avoided as much as 
possible. Droplets of media left around connectors provide a rich growth 
opportunity for microbes, and a risk of transfer to the inside of the tube 
during the next carboy transfer. Self-closing connectors are one preventa- 
tive, though not entirely fail safe. Ethanol or bleach can remove most 
material, though again, such treatment is not fail safe. 

Visual inspection of culture under the microscope or of colonies can detect 
high-frequency contaminants with morphological differences, such as bacteria 
or filamentous fungi. Experimental evolution frequently leads to morpholog- 
ical changes, so this is only appropriate for obviously different species. Also, 
some contaminants may not grow on solid medium. Checking that strain 
markers are constant over a time course can provide experimental assurance 
that other strains or species have not invaded. Strains with drug markers are 
more amenable to this test, but PCR-based markers can also be developed to 
differentiate between common strains. Obviously, if the contaminant is of the 
same strain background, problems will be more difficult to detect. 

Another type of contamination is from the growth chamber into the 
media supply. Aerosolized yeast droplets can be pushed up the tubing if air 
pressure is forced through. Also, variants with improved flocculation
capacity can lodge in crevices at junction points. Clear tubing is recommended
so that such colonization can be detected and problem spots eliminated. 
In extreme cases, the media input port may need to be heated to kill any 
back-contaminants. Wrappable heated tape is one common approach. 



5.5. Good strain hygiene 

Because mutations that arise during experimental evolution are generally 
not cloned using a functional assay, and because multiple mutations are 
typically present after sufficiently long timescales, strains should undergo 
limited passaging before introduction into the growth vessel to limit muta- 
tion accumulation. Even minor handling of strains can introduce lesions 
(see AEP3 example above). Preservation of a time zero sample of the
population for comparison is important to eliminate these possibilities 
when performing post hoc analysis. Exact records of strain stock of origin 
for each population are also recommended, since ambiguities introduce 
uncertainty later in analysis.

5.6. Record-keeping 

Because experimental evolutions may run for long periods of time, and be 
reanalyzed by many people within and between labs, good record-keeping 
practices are essential. An example system uses index cards for recording 
daily data, plus a digital copy of these records for analysis and archiving. 
A master database in Filemaker or some other software package can assist in 
keeping track of many experiments and their related data files, which should 
be backed up at regular intervals. Parameters such as strain background, 
media formulations, growth conditions, and other important details should 
be recorded. To identify potential problems with media composition, 
addition of new media supplies should be tracked. 

Freezer stocks must be maintained in a very ordered way, particularly 
when lab personnel turn over. Systematic naming conventions are essential 
for long-term continuity. Freezer maintenance is also crucial to the long- 
term viability of evolved cultures. If possible, duplicates of glycerol stocks 
may be stored off site for backup purposes. Complex population samples are 
impossible to perfectly duplicate, so preplanning is required for this 
approach. 
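
One way to make such records and stock names systematic is to define them in code. The following is only a sketch: the field names, the label format, and the `DailyRecord` class are hypothetical choices for illustration and are not the system used by the author.

```python
from dataclasses import dataclass, asdict
from datetime import date


@dataclass(frozen=True)
class DailyRecord:
    """One day's observations for one culture (all field names are hypothetical)."""
    experiment_id: str
    sample_date: date
    generation: float
    a600: float
    klett: float
    cells_per_ml: float
    effluent_ml: float
    media_carboy_id: str      # tracks which media batch was feeding the culture
    notes: str = ""

    def stock_label(self) -> str:
        """Systematic freezer-stock name: experiment, date, and generation."""
        return (f"{self.experiment_id}_{self.sample_date:%Y%m%d}"
                f"_g{round(self.generation):03d}")


record = DailyRecord("GLU01", date(2010, 3, 15), 47.2, 1.30, 120.0,
                     1.1e7, 810.0, "C012", notes="no wall growth")
print(record.stock_label())
print(asdict(record))
```

Because the label is derived from the record rather than typed by hand, the freezer box, the index card, and the database entry can all carry the same identifier.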

Sharing populations with other labs is another problem when unique 
samples are involved. Dense lawns or patches, or large numbers of isolated 
colonies, can be scraped from plates and frozen in glycerol culture to 
attempt to maintain population frequencies. 




6. Analysis Techniques 

6.1. Sampling regimen 

Obviously the details of what to measure day-to-day will depend heavily on 
what experiment is being performed. Parameters that may be recorded 
include cell density as surveyed by Klett colorimeter, spectrophotometer, 
or cell count; viable cell count on rich- or low-nutrient plates; notes on cell 
morphology, colony morphology, and flocculation status; and even changes 
in aroma. Frozen stocks may be collected daily. Small samples for processing 
into RNA and DNA may also be collected at intervals. If these samples are 
collected on a filter, a filtrate sample is generated simultaneously that can be 
assayed for residual nutrient or metabolite levels. 

6.2. Population genetics

Population-scale expression and CGH measurements generated from these 
samples can be useful for interrogating the frequencies of copy number 
changes, though with the caveat that frequency changes and extra copy 
number changes in a subpopulation may look identical. Allele frequency 
measurements can also be made via quantitative PCR (Kao and Sherlock, 
2008) or quantitative sequencing (Gresham et al., 2008). Mutations of a
variety of types can be measured via microarray (reviewed in Gresham et al.,
2008), though next generation sequencing is sure to contribute in the near 
future. With strong enough phenotypes, mutations can also be linkage- 
mapped using classical approaches or by bulk segregant analysis (Brauer 
et al., 2006; Segre et al., 2006).

Phenotype characterization methods can also vary. By definition, only 
phenotypes present in the selective conditions are relevant to the experiment 
at hand. However, particularly in the chemostat, it may be impossible to 
survey large numbers of samples for fitness or growth parameters. Bulk 
competition experiments provide one solution (e.g., Gresham et al., 2008).
If individual genotypes can be marked, or detected directly by sequencing, 
their frequency over time can be used to calculate their fitness. Assays on 
plates or nonselective media can also be used as a screen to narrow down the 
search space that needs to be covered in a more tedious growth state. 
However, plate phenotypes do not always behave as expected in the milieu 
of the population. 



6.3. Fitness 

Fitness is generally the most relevant phenotype in an experimental evolu- 
tion, and can conveniently be assayed by direct competition experiments. 
Fitness measurements in particular should be optimally performed not just 
in the conditions under which the strain evolved, but even in the exact 
population context since fitness may be highly dependent on the competi- 
tors (Paquin and Adams, 1983a,b). This condition is almost impossible to 
recreate. In practice, fitness is generally assayed by direct competition with 
the ancestral strain or with other evolved clones. One or both strains may be 
marked to facilitate frequency measurements, or the relative frequency can 
be sampled by following mutations via quantitative sequencing or some 
other means. 
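
A common way to turn such frequency measurements into a fitness estimate is to regress the log ratio of the two strains against generations; the slope is the per-generation selection coefficient. The sketch below assumes that convention, which is not necessarily the calculation used in the studies cited here. The data are synthetic, generated from an assumed coefficient of 0.10, and `selection_coefficient` is a hypothetical helper.

```python
import math


def selection_coefficient(generations, evolved_fraction):
    """Least-squares slope of ln(evolved / reference) against generations."""
    log_ratio = [math.log(p / (1.0 - p)) for p in evolved_fraction]
    n = len(generations)
    mean_t = sum(generations) / n
    mean_y = sum(log_ratio) / n
    sxx = sum((t - mean_t) ** 2 for t in generations)
    sxy = sum((t - mean_t) * (y - mean_y)
              for t, y in zip(generations, log_ratio))
    return sxy / sxx


# Synthetic time course: evolved strain starts at 10% with an assumed s of 0.10.
true_s = 0.10
start = math.log(0.10 / 0.90)
generations = [0, 5, 10, 15, 20]
fractions = [1 / (1 + math.exp(-(start + true_s * t))) for t in generations]

s = selection_coefficient(generations, fractions)
print(f"estimated selection coefficient: {s:.3f} per generation")
```

With real data the fractions would come from colony counts, FACS, or quantitative sequencing, and the fit should be restricted to time points where both strains are frequent enough to be counted reliably.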

Strain tagging with drug resistance markers is the most common way of 
performing mixed fitness assays (e.g., Gresham et al., 2008; Paquin and
Adams, 1983a,b). Fluorescence markers have also been used successfully
(e.g., Kao and Sherlock, 2008; Thompson et al., 2006), and are attractive
because of both their ease of use and improved accuracy. While only 
hundreds to thousands of colonies can be easily assayed for drug resistance, 
orders of magnitude more cells can be assayed by FACS. For both methods,
accuracy improves as sampling density increases. One disadvantage of 
fluorescent markers is that expression of these proteins may impose a 
selective cost, which must be assayed in controls and subtracted out from 
all further measurements. Whether this cost is constant across conditions 
and strain backgrounds must be tested in each situation. 




7. Example Protocol 

Working from these general recommendations, this section describes an 
example glucose-limited chemostat evolution experiment. Related detailed 
protocols with photographs and recipes are available at
http://dunham.gs.washington.edu/.



7.1. Medium formulation 

Chemostat glucose-limited synthetic minimal medium contains (per liter)
0.1 g calcium chloride, 0.1 g sodium chloride, 0.5 g magnesium sulfate,
1 g potassium phosphate monobasic, 5 g ammonium sulfate, 500 µg boric
acid, 40 µg copper sulfate, 100 µg potassium iodide, 200 µg ferric chloride,
400 µg manganese sulfate, 200 µg sodium molybdate, 400 µg zinc sulfate,
1 µg biotin, 200 µg calcium pantothenate, 1 µg folic acid, 1 mg inositol,
200 µg niacin, 100 µg p-aminobenzoic acid, 200 µg pyridoxine, 100 µg
riboflavin, 200 µg thiamine, and 0.08% glucose.

Medium is prepared in 10 l quantities, mixed thoroughly, and filter
sterilized into an autoclaved glass carboy. Carboy has an outlet port at 
bottom, leading to a small piece of tubing with a luer lock connector at 
the end. All entry and exit ports are covered with foil before autoclaving. 
Outflow tubing is sealed with a metal clamp before filling. Carboy is placed 
on a shelf above chemostat area. 



7.2. Chemostat preparation 

A glass-blown chemostat apparatus (Reeves Glass) is outfitted with input 
tubing including an in-line segment of peristaltic pump tubing and an 
appropriate luer fitting for connection to the media supply. The overflow 
port is connected to tubing leading through a bored cork to an effluent 
collection bottle which drains into a larger reservoir. All free tubing ends are 
foil-wrapped, and the entire assembly is placed in a tray and autoclaved. 

7.3. Chemostat assembly

Autoclaved elements are assembled using sterile technique. Airflow is 
provided by an aquarium pump, via a water diffuser for humidification, 
and sterilized by two in-line autoclaved filters. Temperature control at 
30 °C is provided by a circulating waterbath attachment to the water jacket 
of the chemostat vessel. 

Once assembled, carboy outflow is connected to chemostat inflow and the 
tubing is unclamped to allow the chemostat to fill by gravity flow. When the
chemostat begins to overflow (working volume ~200 ml), flow is clamped off via
loading of the pump tubing into the peristaltic pump head. The pump has been 
precalibrated to supply a dilution rate of 0.17 chemostat volumes per hour. 
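
The pump setting and the resulting growth parameters follow from the dilution rate and working volume given above. The short calculation below uses the standard chemostat relationship that, at steady state, the specific growth rate equals the dilution rate, so the doubling time is ln 2 divided by the dilution rate. The variable names are illustrative only.

```python
import math

WORKING_VOLUME_ML = 200.0      # approximate working volume from the protocol
DILUTION_RATE_PER_H = 0.17     # chemostat volumes per hour from the protocol

# Medium flow the pump must deliver to achieve the target dilution rate.
flow_ml_per_h = DILUTION_RATE_PER_H * WORKING_VOLUME_ML

# At steady state, growth rate equals dilution rate.
doubling_time_h = math.log(2) / DILUTION_RATE_PER_H
generations_per_day = 24.0 / doubling_time_h

print(f"pump flow: {flow_ml_per_h:.1f} ml/h")
print(f"doubling time: {doubling_time_h:.2f} h")
print(f"generations per day: {generations_per_day:.2f}")
```

These figures also give the expected daily effluent volume (24 times the hourly flow), which is a convenient check on pump calibration.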

7.4. Inoculation 

The strain FY4, a prototroph haploid of the S288C background, is streaked for 
single colonies from a glycerol stock to a YPD plate and grown at 30 °C for 
2 days. A single colony is inoculated into 2.5 ml of glucose-limited chemostat 
medium and grown overnight at 30 °C. One milliliter of the culture is used to 
inoculate the chemostat and 1 ml is frozen in glycerol stock as the time zero sample.
Chemostat is grown to saturation overnight and then the pump is started. 

7.5. Daily sampling 

Daily, the effluent volume is measured and any necessary modification is 
made to the pump settings. The cork is removed from the effluent bottle 
and placed in a small tube to passively collect 10 ml culture. One milliliter is 
frozen in glycerol stock. The sample is measured for A600 and Klett density,
then briefly sonicated for cell counting in a hemacytometer. Diluted sam- 
ples are plated to YPD and minimal media agar plates. Notes are also 
recorded about cell morphology, colony morphology, and chemostat vessel 
observations (e.g., wall growth, aroma). Carboy supply is monitored and 
new carboys of sterile media are supplied as necessary. A 10 ml sample from 
each retired bottle is collected and frozen for analysis of media composition. 

In the first 2-3 days after inoculation, the culture has not yet reached
steady state. Steady state is usually defined operationally as occurring once all
measurements have been equal for 2 days in a row. This should occur at
approximately generation 10-15.
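
The daily checks on pump setting and steady state can be expressed as small helper functions. This is a sketch under stated assumptions: the function names are hypothetical, the 10% tolerance on dilution rate follows the rule of thumb given earlier in this chapter, and the 5% tolerance used to decide that two daily readings are "equal" is an assumed value that should be replaced by the measurement noise of the actual instruments.

```python
def measured_dilution_rate(effluent_ml, hours_elapsed, working_volume_ml):
    """Dilution rate (per hour) implied by effluent collected since the last check."""
    return effluent_ml / (hours_elapsed * working_volume_ml)


def within_tolerance(measured, target, tolerance=0.10):
    """True if the measured rate is within the given fraction of the target."""
    return abs(measured - target) <= tolerance * target


def at_steady_state(series_by_measurement, rel_tol=0.05):
    """True if, for every measurement, the last two daily readings agree
    within rel_tol (an assumed stand-in for 'equal for 2 days in a row')."""
    for values in series_by_measurement.values():
        if len(values) < 2:
            return False
        prev, last = values[-2], values[-1]
        if abs(last - prev) > rel_tol * max(abs(prev), abs(last)):
            return False
    return True


# Hypothetical readings for illustration only.
rate = measured_dilution_rate(effluent_ml=790, hours_elapsed=24,
                              working_volume_ml=200)
print(f"measured dilution rate: {rate:.3f} per hour, "
      f"on target: {within_tolerance(rate, 0.17)}")

readings = {"A600": [0.20, 0.90, 1.31, 1.30], "Klett": [20, 85, 121, 120]}
print(f"steady state reached: {at_steady_state(readings)}")
```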

7.6. Weekly sampling 

Once or twice a week, 25 ml is passively collected from the effluent port, 
pelleted, and resuspended in glycerol stock for later DNA preparation. Ten 
milliliters of culture is removed from the main vessel via a port in the 
chemostat lid using a sterile pipette. This sample is collected on a filter and
snap-frozen for later RNA preparation. Filtrate from the RNA collection is 
frozen for later metabolite and residual nutrient analysis. 

Sampling continues until an error occurs, or the culture develops a 
clumping phenotype, as defined by clumps that cannot be broken up by 
light sonication and are observed in most microscope fields. 

7.7. Analysis 

Data from the experiment are recorded in the database and the raw data 
index cards are filed in the master system. 

Collected samples are processed for DNA and RNA and assayed via 
microarrays and/or Solexa sequencing for genotype and phenotype differ- 
ences. Collected media and filtrate samples are analyzed for limiting nutrient 
concentrations to ensure constant nutrient source and to detect increased 
consumption. 

Representative clones are isolated from population glycerol stocks and 
assayed for growth phenotypes and mutations. Clones are regrown in new 
chemostats just until reaching steady state, and then harvested for expression 
analysis versus the ancestral strain grown in the same conditions. 

Once mutations are discovered, their gross frequency can be retrospec- 
tively assayed by performing PCR directly on small samples of cells obtained 
from the population glycerol stocks. The mixed PCR product is sequenced 
to determine the relative amount of each allele. Time zero samples are included
to ensure mutation was not already present in the inoculum. 

Clones may also be subjected to competition versus a marked wild-type 
ancestor strain. In this case, both strains are grown to steady state in 
individual chemostats and then mixed. In the null expectation, 50% of each 
strain should be present, but this often gives insufficient survey time for 
evolved strains with 5–50% fitness increases. When the strain is known or 
suspected to carry such an advantage, more useful data can be collected 
using a starting frequency of 5–10%. 
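
The effect of the starting frequency on survey time can be illustrated with a minimal sketch. It assumes a simple model in which the ratio of evolved to ancestral cells grows exponentially with a constant per-generation selection coefficient `s`; this model, the function names, and the example value of `s` are assumptions for illustration and are not taken from the protocol.

```python
import math


def evolved_fraction(p0, s, generations):
    """Fraction of the evolved strain after a number of generations.

    Assumes the evolved:ancestor ratio grows as exp(s * generations),
    where s is a constant per-generation selection coefficient
    (a modeling assumption of this sketch).
    """
    ratio = (p0 / (1.0 - p0)) * math.exp(s * generations)
    return ratio / (1.0 + ratio)


def generations_to_reach(p0, s, p_end=0.95):
    """Generations needed for the evolved strain to reach fraction p_end."""
    def logit(p):
        return math.log(p / (1.0 - p))
    return (logit(p_end) - logit(p0)) / s


# Hypothetical evolved strain with a 25% advantage, started at three frequencies.
for p0 in (0.50, 0.10, 0.05):
    print(p0, round(generations_to_reach(p0, s=0.25), 1))
```

Under these assumptions, starting at 5% rather than 50% roughly doubles the number of generations available before the evolved strain takes over the culture, which is the practical reason for the lower starting frequency.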




8. Conclusions 

Experimental testing of evolutionary questions is almost as old as evo- 
lutionary theory itself. The use of these techniques in yeast is yielding exciting 
results in evolutionary genomics, systems biology, and theory, complement- 
ing the excellent comparative genomic and ecological tools also maturing in 
yeast concurrently. The use of evolution as a tool will also help tune synthetic 
systems and generate new and useful strains and constructs. This guide is only 
to be taken as touching on the highlights of this exciting field, and, hopefully, 
lowering the bar to entry for new researchers. 
Abstract 

As Saccharomyces cerevisiae is engineered further as a microbial factory for 
industrially relevant but potentially cytotoxic molecules such as ethanol, issues 
of cell viability arise that threaten to place a biological limit on output capacity 
and/or the use of less refined production conditions. Evidence suggests that 
one naturally evolved mode of survival in deleterious environments involves the 
complex, multigenic interplay between disparate stress response and homeostasis 
mechanisms. Rational engineering of such resistance would require a 
systems-level understanding of cellular behavior that is, in general, not yet 
available. To circumvent this limitation, we have developed a phenotype discov- 
ery approach termed global transcription machinery engineering (gTME) that 
allows for the generation and selection of nonphysiological traits. We alter gene 
expression on a genome-wide scale by selecting for dominant mutations in a 
randomly mutagenized general transcription factor. The gene encoding the 
mutated transcription factor resides on a plasmid in a strain carrying the 
unaltered chromosomal allele. Thus, although the dominant mutations may 
destroy the essential function of the plasmid-borne variant, alteration of the 
transcriptome with minimal perturbation to normal cellular processes is possi- 
ble via the presence of the native genomic allele. Achieving a phenotype of 
interest involves the construction and diversity evaluation of yeast libraries 
harboring random sequence variants of a chosen transcription factor and the 
subsequent selection and validation of mutant strains. We describe the ratio- 
nale and procedures associated with each step in the context of generating 
strains possessing enhanced ethanol tolerance. 




1. Introduction 

In addition to its central position in the histories of baking and 
brewing, Saccharomyces cerevisiae has played a prominent role in biotechno- 
logy as a production platform for numerous therapeutics and industrial small 
molecules. Typically, strain engineering has involved manipulating bio- 
chemical fluxes in a rational manner for the maximal accumulation of 
either commercially relevant native metabolites or products synthesized 
by heterologously expressed genes (Nevoigt, 2008). However, as the budding 
yeast is modified further to expand its output capacity or product 
spectrum (Hadiji-Abbes et al., 2009; Rao et al., 2008; Waks and Silver, 
2009), the likelihood of encountering conditions that prove stressful or 
even cytotoxic increases greatly. The ability to withstand such deleterious 
circumstances is likely to be a complex phenotype impinging on multiple stress 
response mechanisms simultaneously. For example, physiological tolerance 
to ethanol — a natural product of glycolytic fermentation — is thought to be 
imparted through the coordinated action of proteins involved in functions 
as diverse as vacuolar maintenance, phosphatidyl inositol metabolism, and 
histone acetylation (van Voorst et al., 2006). A rational approach to 
engineering tolerance would thus likely necessitate the development of a 
comprehensive genome-scale model of homeostasis, and the directed, con- 
current manipulation of multiple pathways. Unfortunately, a quantitative 
wiring diagram of such breadth does not currently exist. 
In the absence of such a detailed, systems-level understanding of cellular 
control, we have developed a mutagenesis and selection approach termed 
global transcription machinery engineering (gTME) that generates non- 
physiological traits through the alteration of transcription on a genome- 
wide scale. In contrast to traditional protein engineering where individual 
enzyme or binding partner variants are screened for modified activity 
in vitro, gTME involves the introduction of a randomly mutagenized, 
plasmid-borne copy of a general (i.e., gene nonspecific) transcription factor 
into yeast and the selection for cellular phenotypes created in vivo. Novel 
phenotypes are thus produced through the differential regulation of tran- 
scriptional programs elicited by the presence of the mutated allele. Indeed, 
a reconstruction of evolutionary divergence across a large group of yeast 
species has shown that much natural phenotypic innovation emerges not 
from changes to protein function, but rather from gene duplication and 
subsequent changes to transcriptional regulation (Wapinski et al., 2007). 
Likewise, gTME involves two copies of a target transcriptional regulator: a 
mutagenized variant on a plasmid and the intact, native allele on the 
chromosome. Thus, the isolation of desired phenotypes arising from altered 
function of even essential genes is possible as the presence of the chromo- 
somal allele continues to provide that function. 

Although published later, perturbation of the transcriptome via modifica- 
tion of a global transcription factor was originally conceived in Escherichia 
coli as an engineering strategy to unlock multigenic phenotypes (Alper and 
Stephanopoulos, 2007). Variants of the rpoD gene were shown to confer 
enhanced lycopene production and combined ethanol and sodium dodecyl 
sulfate (SDS) tolerance in bacteria. Subsequently, the technique was adapted 
for use in S. cerevisiae and demonstrated to be successful in imparting elevated 
ethanol resistance and rates of fermentation (Alper et al., 2006). This chapter 
thus details the various steps associated with implementing gTME in the 
budding yeast using the goal of enhanced ethanol tolerance as a template. In 
brief, the overall approach can be organized into five major steps: (1) identifi- 
cation of a relevant transcription factor, (2) selection, design, and construction 
of expression constructs for mutant plasmid libraries, (3) creation of yeast 
libraries and evaluation of phenotypic diversity, (4) selection for phenotypes 
of interest, and (5) validation of mutated transcription factor alleles. 




2. Transcription Factor Selection 

The transcription machinery component(s) chosen for mutagenesis 
will perhaps have the largest influence on the likelihood of obtaining the 
desired phenotype. Depending on the biology thought to underlie that 
phenotype, the choice may be obvious or speculative. Since it is a decision 
that may ultimately call for some informed guesswork, we furnish several 
examples here in an attempt to assist in the selection process. 

Knowledge of the processes that could potentially contribute to the 
desired phenotype should provide the best hint of the transcriptional com- 
plexity and relevant players involved. On the simpler end of the spectrum, 
for example, a few cellular functions can be reduced to gene expression 
programs dominated by a handful of regulators. If the desired phenotype is 
enhanced galactose metabolism, the GAL3, GAL4 activators, and/or the 
GAL80 repressor would be clear targets (Lohr et al., 1995); for enhanced 
phosphate utilization, obvious choices would be the PHO2 and/or PHO4 
activators (Ogawa et al., 2000). In dissecting more complex cellular traits, 
even what appears to be sophisticated behavior can often derive from the 
combinatorial interplay of a small number of factors. For example, if the 
goal is greater tolerance to osmotic stress (a response that integrates multiple 
signals and modulates the expression of hundreds of genes), the SKO1, 
HOT1, and/or MSN2/4 transcription factors would serve as appropriate 
candidates (Capaldi et al., 2008). In our efforts to improve ethanol tolerance, 
proteins with widely disparate functions were known to be involved yet no 
transcriptional pathways presented themselves as obvious targets. Thus, 
to maximize the number of transcripts perturbed, SPT15 (the S. cerevisiae 
homolog of the TATA-box binding protein) was chosen for its putative 
ability to control as much as 90% of the yeast transcriptome by its association 
with the general transcription complex TFIID (Huisinga and Pugh, 2004; 
Kim and Iyer, 2004). 

In general, the more nonspecific and globally acting the transcription 
component, the higher the number of genes that will be affected and, in 
principle, the wider the sector of adaptation space that will be explored. 
However, readers are encouraged to evaluate transcription factors more 
specific to their pathways of interest to increase the probability of sampling 
advantageous phenotypes. In addition to surveying the literature and the 
numerous internet-based resources available (e.g., Saccharomyces Genome 
Database, MIPS Comprehensive Yeast Genome Database), candidates may 
also be identified through consideration of mechanisms underlying similar 
traits in other organisms. Nevertheless, if no clear targets exist, one of the 
many subunits of TFIID may serve as a suitable starting point. 




3. Plasmid Library Construction 



3.1. Promoter selection 



The manner in which the plasmid-borne, mutated allele is expressed is also 
an important variable having a large influence on the likelihood of success. 
A natural choice is to drive the mutagenized transcription factor by its 
endogenous promoter. This strategy may be appropriate for applications 
where native regulation of the gene is desired; indeed, numerous enhancements 
were generated in E. coli from variants of rpoD coupled to its upstream 
intergenic sequence (Alper and Stephanopoulos, 2007). However, preserving 
the physiological control of a transcription factor creates the possibility of its 
repression in the potentially nonphysiological conditions of the selection 
process. Thus, the use of a constitutive promoter is recommended, at least 
until further constraints or data in a specific application call for an alternative. 

Although constitutive expression may avert possible downregulation of the 
plasmid-borne allele, promoter strength is a variable that must be optimized. 
Low transcription levels could result in unaltered phenotypes, while transcrip- 
tion rates that are too high risk out-competition of the native transcription 
factor and possible cytotoxicity. The ratio of mutated to wild-type proteins that 
would result in the maximal number of beneficial traits is believed to be 
influenced by many factors including the expression level of the chromosomal 
allele, redundantly acting paralogs, and the ploidy and genetic background of 
the host strain. Consequently, an optimal expression level must be determined 
empirically. A reasonable approach is to titrate promoter strength on a plasmid 
copy of the wild-type transcription factor and use growth as a quantifiable 
proxy for the phenotype of interest (a proxy particularly applicable to tolerance 
phenotypes). This is supported by preliminary data showing that survivability 
trends (in stress and nonstress conditions alike) are somewhat correlated 
between strains carrying a gTME allele and strains carrying an additional native 
allele (Alper et al., unpublished). 

In principle, any series of yeast promoters with varying constitutive 
strengths may be used. For example, the p413–p426 family of shuttle expression 
vectors offers a ~1000-fold range of transcription rates through low 
(CEN/ARS) and high (2 μ) copy number plasmids containing either a 
CYC1, TEF1 (referred to originally as TEF2, from before the availability of 
the complete S. cerevisiae genome sequence), ADH1, or TDH3 (often referred 
to as GPD) promoter (Funk et al., 2002; Mumberg et al., 1995). It is worth 
noting that the "constitutive" ADH1 promoter is, in fact, repressible on 
nonglucose medium (Denis et al., 1983), and that select versions of these 
plasmids carrying the dominant kanamycin drug resistance gene have been 
developed for use with prototrophic yeast strains (Dualsystems Biotech AG, 
Switzerland). Alternatively, one could opt for a promoter series derived from a 
common sequence background offering finer-grained increments in transcription 
rate. For example, a collection of 11 TEF1 promoter mutants based on 
p416TEF was developed that features expression levels between ~8% and 
120% of the wild type (Alper et al., 2005; Nevoigt et al., 2006). Furthermore, 
the TEF1 promoters offer constitutive behavior that is preserved on both 
glucose and nonfermentable carbon sources. 

After selecting a promoter family, insert the coding sequence of the 
wild-type transcription factor in front of the yeast promoter in each member 
of the series using established DNA cloning and manipulation techniques 
(Ausubel, 2001). Typically, these yeast expression vectors have multiple 
cloning sites flanked by the promoter and a common transcriptional termi- 
nator (e.g., CYC1). Thus, one generally does not need to be concerned 
with any endogenous untranslated regulatory sequences and can directly 
clone in the open reading frame of interest (e.g., using polymerase chain 
reaction (PCR) and primers incorporating the appropriate restriction sites). 
To minimize the possibility of DNA sequence variation across various 
S. cerevisiae strain backgrounds, genomic DNA prepared from the host 
strain should be used as the source of the coding sequence. Additionally, 
as a functional control for expression, it is recommended that an identical 
plasmid series containing a transcriptional reporter as the insert (e.g., yeast- 
enhanced green fluorescent protein) be obtained or constructed at the same 
time so that promoter strengths can be verified independently by fluorescence 
or another method as appropriate (Cormack et al., 1997). 

After preparation and sequence validation of the plasmid family, assess 
the in vivo impact of the additional transcription factor copy as follows. 
Using standard DNA uptake and strain selection techniques (Ausubel, 
2001), transform the plasmids into yeast to create a family of strains where 
each member contains a unique construct from the expression series. From 
individual colonies, create starter cultures by inoculating each transformant 
into laboratory standard liquid selection medium and growing cells to 
saturation. Prepare liquid medium containing the condition of interest; if 
such a condition is toxic, determine a dilution that is amenable to growth 
beforehand. Since it is the relative performance between the differentially 
promoted transcription factors that is important, any dilution allowing 
growth will suffice. Using established cell density measurement methods, 
inoculate the liquid medium of interest to a common density from the 
saturated cultures and follow the increase in cell number until stationary 
phase (Ausubel, 2001). Determine the member of the expression series that 
confers the highest performance by using either the mid-logarithmic phase 
growth rate or final stationary phase cell density as a metric. It is also 
important to evaluate the expression series driving the transcriptional 
reporter under the condition of interest, especially if it is very different 
from common laboratory conditions where these promoters are usually 
assayed. 
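
The comparison of expression-series members by mid-logarithmic growth rate can be sketched as follows. This is a minimal illustration, assuming that readings have already been restricted to the exponential phase; the function names and the optical density values are hypothetical and serve only to show the calculation.

```python
import math


def growth_rate(times_h, ods):
    """Least-squares slope of ln(OD) versus time, in units of 1/h."""
    ys = [math.log(od) for od in ods]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(ys) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, ys))
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    return sxy / sxx


def best_construct(curves):
    """Name of the construct with the highest fitted growth rate."""
    return max(curves, key=lambda name: growth_rate(*curves[name]))


# Hypothetical mid-log readings (hours, OD600) for three promoter constructs.
curves = {
    "CYC1": ([0, 1, 2, 3], [0.10, 0.13, 0.17, 0.22]),
    "TEF1": ([0, 1, 2, 3], [0.10, 0.15, 0.22, 0.33]),
    "TDH3": ([0, 1, 2, 3], [0.10, 0.12, 0.14, 0.17]),
}
print(best_construct(curves))
```

The same ranking function could be applied to final stationary-phase densities instead, by replacing the fitted slope with the last reading of each curve.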

For goals of enhanced production rather than tolerance, studies in E. coli 
have shown that biomass formation is often correlated with product yields, 
particularly for metabolites or recombinant molecules synthesized through 
fermentative pathways (Babaeipour et al., 2008). Thus, growth may also 
serve as an appropriate proxy for production. However, conflicting data 
also exist suggesting that output may not necessarily be coupled to growth 
under certain cultivation conditions (Lutke-Eversloh and Stephanopoulos, 
2008). Given these uncertainties, it is best to directly read out product 
formation whenever possible. For example, visual or absorbance-based assays 
may be easily developed that allow for fast and specific detection of the 
product of interest (Santos and Stephanopoulos, 2008; Yu et al., 2008). 

3.2. Random mutagenesis by PCR 

Packaged solutions for generating pools of sequence variants by error- 
prone PCR are commercially available and simplify much of the handling 
and reagent optimization previously needed to achieve specific mutation 
frequencies. Enzyme mixtures have been further optimized for less mutational 
bias, and one need only supply primers and a plasmid template to 
obtain a large library of sequence variants (~ 10 —10 ) within several days' 
time. In the past, our laboratory has extensively used the GeneMorph II 
(Stratagene, La Jolla, CA) random mutagenesis kit where PCR fragments 
are manually gel-purified, restriction-digested, and ligated into the back- 
bone of the plasmid template. Recently, we have begun using the newer 
GeneMorph II EZClone (Stratagene) domain mutagenesis kit that, in 
brief, bypasses the restriction and ligation steps by using the mutagenized 
PCR products themselves as "megaprimers" to anneal to and completely 
replicate the original plasmid in a second iteration of PCR. As a tradeoff, 
the EZClone kit (unlike the original) is formally restricted to the mutagenesis 
of sequences up to 3.5 kb; fortunately, the majority of S. cerevisiae 
coding sequences are well under this limit. The procedure outlined here 
thus closely follows and refers to the official GeneMorph II EZClone 
protocol; however, we provide commentary and a few changes custo- 
mized to our application. Furthermore, because equivalent steps can, in 
fact, be performed using material from the original GeneMorph II kit plus 
reagents supplemented separately, these substitutions will be documented, 
as well. 

The plasmid selected from the expression series containing the optimally 
promoted wild-type transcription factor serves as the template for error-prone 
PCR. The template DNA must be methylated to allow for digestion 
by DpnI in a subsequent step; therefore, the use of dam+ E. coli strains for 
preparation of the plasmid template is required. Design PCR primers to flank 
the entire coding sequence of the transcription factor. However, if there is 
evidence suggesting that particularly advantageous phenotypes may arise 
from mutations enriched in particular regions (e.g., DNA-binding domain 
or protein-binding interface), design primers to amplify those specific seg- 
ments. Furthermore, because errors are generated with each primer exten- 
sion, mutations accumulate with each additional cycle of PCR and the final 
average mutation frequency is directly proportional to the total number of 
fragment duplications. Thus, one can establish the library mutation fre- 
quency simply by changing the total number of PCR cycles or, better yet, 
by the input amount of template DNA (for more consistent product yields). 
Finally, combined experimental and modeling studies have shown that an 
optimal, protein-specific mutation frequency exists that maximizes sequence 
diversity while retaining protein function (Drummond et al., 2005). Therefore, 
it is recommended that one follow the suggested practice of constructing 
multiple libraries with different average mutation frequencies to maximize 
the probability of isolating enhanced mutants. 

To perform mutagenesis by error-prone PCR, follow the Stratagene 
instructions and prepare 50 μl reactions corresponding to low (0–4.5 mutations/kb; 
use ~1 μg initial DNA), medium (4.5–9 mutations/kb; use 
~300 ng initial DNA), and high (9–16 mutations/kb; use ~50 ng initial 
DNA) mutation frequencies. The values listed in parentheses indicate the 
initial target amount of DNA, not the total input amount of plasmid. For 
example, to produce full-length variants of a 2 kb transcription factor cloned 
into a plasmid with a total length of 8 kb at the medium mutation 
frequency, use 1.2 μg of plasmid in the sample reaction. Since it is the 
nontemplating components of the PCR mixture that ultimately become 
limiting, final product yields will typically all be similar regardless of the 
different initial target amounts of DNA used to induce the different 
proportions of substitutions. 
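
The conversion from initial target amount to total plasmid input is a simple length ratio, sketched below. The function name is hypothetical; the numbers reproduce the 2 kb insert in an 8 kb plasmid example from the text.

```python
def plasmid_input_ng(target_ng, target_kb, plasmid_kb):
    """Total plasmid mass that contains the requested mass of target sequence.

    The target makes up target_kb / plasmid_kb of the plasmid by mass,
    so the plasmid input is the target amount scaled by the inverse ratio.
    """
    return target_ng * plasmid_kb / target_kb


# Medium mutation frequency: ~300 ng of target, 2 kb insert, 8 kb plasmid.
print(plasmid_input_ng(300, target_kb=2, plasmid_kb=8))  # 1200.0 ng, i.e., 1.2 ug
```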

After proceeding with the Stratagene-suggested 30 cycle program, it is 
important both to quantify the yield and inspect homogeneity of the PCR 
product (also referred to from here on as megaprimers) by gel electrophoresis. 
Separate the mutagenized fragments from the plasmid template on 1% 
agarose by running out 5 μl of each reaction along with 50 ng of the 
included 1.1 kb standard or other appropriate standards for DNA quantification. 
To verify that the expected mutation frequencies were achieved, it is 
generally sufficient to visually confirm that the quantity in the product band 
lies between 50 ng and 1 μg. However, for a more accurate estimate, 
perform the quantification by densitometry of the product band and a 
standard curve of known DNA concentrations (of a fragment length similar 
to the amplicon), and interpolate off the manufacturer's published linear 
regression of mutation frequency versus duplication (usually located in the 
instruction appendix). 
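
The duplication count used for the interpolation can be estimated from the gel quantification, as in the following sketch. It assumes that the number of duplications is the base-2 logarithm of total product yield over initial target amount; the regression coefficients are deliberately left as parameters because they must be read from the manufacturer's instructions, and all function names and the band mass are hypothetical.

```python
import math


def total_yield_ng(band_ng, loaded_ul=5.0, reaction_ul=50.0):
    """Scale the mass measured in the loaded aliquot to the whole reaction."""
    return band_ng * reaction_ul / loaded_ul


def duplications(yield_ng, initial_target_ng):
    """Number of target doublings, d = log2(yield / initial target)."""
    return math.log2(yield_ng / initial_target_ng)


def mutation_frequency(d, slope, intercept):
    """Mutations/kb from a linear regression on d.

    The slope and intercept are not assumed here; they must be taken
    from the regression published with the kit.
    """
    return slope * d + intercept


# Hypothetical band of 480 ng in the 5 ul aliquot, 300 ng initial target.
d = duplications(total_yield_ng(band_ng=480.0), initial_target_ng=300.0)
print(round(d, 2))
```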

Once average mutation frequencies have been confirmed, purify the 
remaining 45 μl of megaprimers by standard gel extraction methods. It is 
recommended that each sample be divided across at least two lanes to avoid 
loss through overloading of the gel or saturation of the DNA binding 
columns. The Stratagene protocol indicates that direct column purification, 
as an alternative, is permissible for PCR samples that use minuscule amounts 
of template (i.e., high mutation frequency reactions using <50 ng initial 
target DNA). Technically, this step is justified for our application since the 
second (megapriming) PCR will, in fact, be using the same plasmid tem- 
plate. However, it is better practice to remove all contaminating traces of 
the original plasmid in order to have stricter control over its amount in the 
second PCR reaction mixture. This issue is particularly relevant for the 
eventual elimination of methylated DNA by DpnI digestion: because the 
plasmid template is methylated (i.e., prepared from dam+ E. coli strains), it is 
important to minimize its amount and the resulting proportion of hemimethylated 
(but correctly megaprimed) plasmids generated by the second 
PCR that will also be subject to digestion. 

Next, quantify the concentration of the gel-extracted megaprimers by 
agarose gel electrophoresis. As before, visualize a 1–5 μl aliquot (depending 
on the initial concentration) alongside 50 ng of the included gel or other 
appropriate standard, or perform densitometry accompanied by a standard 
curve of known DNA quantities. As an alternative to gel-based quantification, 
megaprimer concentrations (now free from plasmid contaminants) may also 
be assessed by an ultralow volume spectrophotometer (e.g., NanoDrop, 
Thermo Fisher Scientific, Wilmington, DE). 

For the second iteration of PCR ("EZClone Reaction"), follow the 
Stratagene instructions and prepare as many reactions as necessary (typically, 
at least five samples per mutation frequency) to exhaust the entire volume of 
purified megaprimers. Use the original plasmid as template, and proceed 
with the recommended 25 cycle plasmid amplification program. If the 
supply of EZClone enzyme and solution mix has been depleted, or the 
megaprimers were produced using a means other than the GeneMorph II 
EZClone kit, one can obtain similar results using a substitute high-fidelity 
and high-processivity enzyme such as the Phusion™ High-Fidelity 
DNA Polymerase (New England Biolabs, Ipswich, MA). Be sure to make 
adjustments to the temperature cycling program (e.g., extension times) as 
necessary. 

After completing the plasmid amplification, the product pool will be a 
collection of nicked plasmids, the majority of which will be unmethylated, 
doubly/stagger-nicked, and contain the mutagenized fragments. In the 
minority will be the methylated template and singly nicked, hemimethylated 
plasmids containing one extended megaprimer. Prepare the reactions 
for DpnI digestion of the methylated and hemimethylated species by pooling 
all PCR samples corresponding to the same mutation frequency and 
cooling to below 37 °C. Add 10–20 U of DpnI restriction enzyme per 50 μl 
PCR reaction directly to the pooled samples (i.e., no need for a DpnI-specific 
buffer), mix thoroughly, and incubate at 37 °C for 2 h. Heat 
inactivate each sample at 80 °C for 20 min, and concentrate and purify 
the mutated plasmids by standard DNA column purification. 

As repair and replication of nicked DNA are carried out in E. coli, 
introduce the purified, mutated plasmids into the included XL10-Gold 
ultracompetent cells by following the Stratagene instructions. To maximize 
the diversity of the resulting library, prepare as many transformations as 
necessary to consume each stock of plasmid while attempting to use similar 
DNA concentrations and volumes for all reactions. As an alternative to 
XL10-Gold, any ultracompetent (chemically or electro-) cells will serve the 
same purpose provided that the highest yield recipients are used. Electroporation 
generally gives the highest efficiencies, but be reminded that high 
volumes of DNA will alter the overall salt concentration and electrical 
resistance of the sample, and thus dramatically decrease transformation 
efficiencies (Ausubel, 2001). 

After DNA uptake, add outgrowth medium to each reaction, and again 
pool all samples corresponding to the same mutation frequency. Estimate 
the total outgrowth culture volume V_0 as best as possible (for determination 
of library size), and incubate at 37 °C for 1 h with agitation to allow for 
expression of the resistance gene but minimal cell replication. Plate 50 μl of 
a 1:10 to 1:100 dilution (or an appropriate dilution for reliable colony counts) 
on each of three LB agar plates containing the relevant antibiotic, and allow 
colonies to form by incubating at 37 °C for at least 16 h. 

For the remaining outgrowth culture (~V_0), inoculate into 500 ml of 
liquid LB medium containing the appropriate antibiotic and immediately 
measure the diluted cell density by absorbance (OD600,0). Grow at 37 °C 
with shaking until saturation and measure the final cell density OD600,f to 
estimate the fold-increase in cell number. This culture expansion will 
homogeneously amplify all mutants in the library under the assumption 
that variants of a yeast transcription factor will not elicit significant differen- 
tial rates of growth in E. coli. 



3.3. Quantification of total sequence diversity and 
library maintenance 

Count the number of individual bacterial colonies on each of the three agar 
plates to determine the mean number of successful transformants per 50 μl 
of diluted outgrowth culture. As these platings have been performed after 
outgrowth but before significant cell replication, each colony should represent 
a unique transcription factor variant. Scale this average up to the total 
outgrowth volume V_0 to arrive at the total library size. Based on previous 
efforts, sequence diversity in the range of 10 or above is satisfactory. 
Furthermore, it is informative to pick at least 10 random colonies from 
these agar plates for small-scale preparations of plasmid DNA and to 
sequence the transcription factor inserts to confirm that the desired muta- 
tion frequencies were attained. 
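
The scale-up from plate counts to total library size can be sketched as below. The function name, colony counts, dilution, and outgrowth volume are hypothetical values chosen only to illustrate the arithmetic; the dilution factor is expressed as the fold dilution (e.g., 100 for a 1:100 dilution).

```python
def library_size(colony_counts, plated_ul, dilution_factor, outgrowth_ml):
    """Estimate the total number of independent transformants.

    colony_counts   -- colonies counted on each replicate plate
    plated_ul       -- volume of diluted culture spread per plate (ul)
    dilution_factor -- fold dilution applied before plating
    outgrowth_ml    -- total pooled outgrowth volume V_0 (ml)
    """
    mean_colonies = sum(colony_counts) / len(colony_counts)
    transformants_per_ul = mean_colonies * dilution_factor / plated_ul
    return transformants_per_ul * outgrowth_ml * 1000.0


# Hypothetical counts from three plates of 50 ul at 1:100, with V_0 = 10 ml.
print(library_size([180, 210, 195], plated_ul=50, dilution_factor=100, outgrowth_ml=10))
```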

To prepare frozen bacterial stocks that contain adequate representation 
of each library, a statistical model based on a Poisson distribution of variants 
predicts that a minimum threefold excess of clones is needed to ensure 95% 
coverage (Patrick et al., 2003; Reetz et al., 2008). The amount of saturated 
culture V_s needed to sample the library once over is simply the total culture 
volume divided by the fold-increase in cell density: 



Transcriptome Engineering 519 



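$$V_s = V_0 \cdot \frac{500\ \mathrm{ml}/V_0}{\mathrm{OD}_{600,f}/\mathrm{OD}_{600,0}} = 500\ \mathrm{ml}\left(\frac{\mathrm{OD}_{600,0}}{\mathrm{OD}_{600,f}}\right) \qquad (20.1)$$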

For example, an outgrowth culture diluted to OD600,0 = 0.03 and 
grown to OD600,f = 3 would require V_s = 5 ml of the final culture to 
encapsulate a postamplification equivalent of clones. Incidentally, because 
V_s is the proportion of V_0 that was diluted and subsequently amplified, the 
changes in cell density are directly counterbalanced and thus V_0 cancels in 
the final calculation. 
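
The once-over sampling volume can be computed directly from the two absorbance readings, as in this short sketch. The function name is hypothetical, and the 500 ml default matches the culture volume used in the text.

```python
def sampling_volume_ml(od_start, od_final, culture_ml=500.0):
    """Volume of saturated culture holding one library equivalent (V_s).

    V_s is the total culture volume divided by the fold-increase
    in cell density, od_final / od_start.
    """
    return culture_ml * od_start / od_final


# Worked example from the text: OD 0.03 at inoculation, OD 3 at saturation.
print(round(sampling_volume_ml(0.03, 3.0), 2))  # 5.0 ml
```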

This calculation also assumes that 100% of the cells in the outgrowth 
culture take up plasmid and contribute to expansion of the culture; 
in reality, this is unlikely to be the case and only a smaller fraction is 
viable. Fortunately, this means that the fold-increase is likely larger than 
OD600,f/OD600,0, and V_s is, in fact, a conservative estimate of the volume 
needed for 1× library sampling. To achieve the target minimum 3× oversampling, 
concentrate the saturated culture by centrifugation for 10 min at 
4000 × g, 4 °C, and resuspend the cells in a volume such that 1 ml will 
contain a threefold equivalent of the library (i.e., concentrate the culture by 
a factor of 3 × V_s, with V_s expressed in ml). From the example above where 
V_s = 5 ml, pellet the 
500 ml culture, and resuspend in 33.3 ml of LB containing antibiotic. To 
several 2 ml cryogenic vials, aliquot 1 ml of the concentrated culture, add 
sterile glycerol to produce a final concentration of 15%, and freeze immediately 
at −80 °C. Each frozen stock will thus provide at least 3× coverage 
of the library, and may be thawed and used as an inoculum for further 
propagation of the plasmid library in the future. 
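
The resuspension volume and the coverage expected from threefold oversampling can be checked with the sketch below. It assumes the simple Poisson picture in which a variant sampled an average of k times is missed with probability exp(-k); the function names are hypothetical.

```python
import math


def resuspension_volume_ml(culture_ml, v_s_ml, coverage_per_ml=3.0):
    """Volume in which to resuspend the pellet so that each 1 ml aliquot
    holds coverage_per_ml library equivalents."""
    return culture_ml / (coverage_per_ml * v_s_ml)


def expected_coverage(oversampling):
    """Expected fraction of variants present at least once when each is
    sampled `oversampling` times on average (Poisson assumption)."""
    return 1.0 - math.exp(-oversampling)


print(round(resuspension_volume_ml(500.0, 5.0), 1))  # 33.3 ml
print(round(expected_coverage(3.0), 3))              # 0.95
```

Under the same assumption, a single library equivalent per vial would be expected to capture only about 63% of the variants, which is why the threefold excess is used.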

Centrifuge the remaining volume of culture and perform a large-scale 
preparation of plasmid DNA. Based on a typical plasmid size of 5–15 kb, a 
final DNA concentration on the order of 1 μg/μl is equivalent to ~10^11 
plasmids/μl. Thus, for a library of size 10 , 1 μl of plasmid should provide 
approximately 10 copies of each transcription factor variant. However, the 
transformation efficiency specific to the protocol and strain of yeast will 
eventually be the limiting factor, and must therefore be determined 
(outlined in the next section) before proceeding to the selection phase. 
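
The conversion from DNA mass to plasmid copy number can be sketched as follows. It uses the common approximation of about 650 g/mol per base pair of double-stranded DNA, which is an assumption of this sketch rather than a value from the protocol; the function names and the library size in the usage line are hypothetical.

```python
AVOGADRO = 6.022e23
BP_MASS_G_PER_MOL = 650.0  # approximate average mass of one base pair (assumption)


def plasmid_copies(mass_ug, plasmid_kb):
    """Approximate number of plasmid molecules in a given mass of DNA."""
    grams = mass_ug * 1e-6
    molar_mass = plasmid_kb * 1000.0 * BP_MASS_G_PER_MOL
    return grams / molar_mass * AVOGADRO


def copies_per_variant(mass_ug, plasmid_kb, library_size):
    """Average number of copies of each library member in the sample."""
    return plasmid_copies(mass_ug, plasmid_kb) / library_size


# 1 ug of a 10 kb plasmid, and a hypothetical library of one million variants.
print(f"{plasmid_copies(1.0, 10.0):.2e}")
print(f"{copies_per_variant(1.0, 10.0, 1e6):.2e}")
```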




4. Assessment of Phenotypic Diversity 

Although one can estimate the number of unique clones contained 
within a plasmid library, its sequence diversity is unlikely to translate into 
the same number of unique cellular phenotypes in yeast. For example, in 
addition to synonymous substitutions, there may be combinations of muta- 
tions that elicit such minor structural changes that the protein is effectively 



520 Felix H. Lam et al. 

unchanged in function. Conversely, there may be mutations so deleterious 
that the cell is rendered inviable. Depending on the ultimate phenotype of 
interest, the selection process may also require a large investment of time 
and/or reagents. Thus, some strategy of assessing and preevaluating the 
range of phenotypic consequences contained within the various libraries 
would be extremely valuable by allowing comparison and vetting of the best 
candidates to proceed with. 

A method and corresponding metric developed specifically to quantify a 
library's inherent adaptation potential is the phenotypic diversity (Klein- 
Marcuschamer and Stephanopoulos, 2008). In brief, a cellular property is 
chosen based on two criteria: (1) it is the culmination of the complex 
interplay between numerous cellular processes, and (2) it is quantifiable in 
a relatively high-throughput manner. For example, quantities such as
growth rate or cytosolic pH can be easy readouts that integrate a multitude 
of intracellular signals pertaining to cellular health (Karagiannis and Young, 
2001; Klein-Marcuschamer et al., 2009). The property is then measured
separately in populations of untransformed host cells and cells transformed 
with a library, and the phenotypic diversity is calculated by comparing the 
means and dispersions of the two distributions. The phenotypic diversity is 
formally a statistical measure of the additional behavioral heterogeneity that 
has been manifested over the host population. Performing this procedure 
with libraries of different mutation frequencies can provide an indicator for 
which collections have been under- or overmutagenized and a method for 
ranking which libraries demonstrate the greatest phenotypic potential. 
Indeed, it has been shown in bacteria that this value is highly correlated 
with the probability of isolating improved strains (Klein-Marcuschamer and 
Stephanopoulos, 2008). Furthermore, these rankings can be revised or 
strengthened by redetermining phenotypic diversity values under different 
conditions, perhaps even those approximating the eventual selection pro- 
cess. Finally, because phenotypic diversity is a normalized score — it is always 
calculated relative to the untransformed population under a specific condi- 
tion — it is, in theory, valid to extend the use of this metric to the compari- 
son and rating of libraries constructed from different transcription factors. 



4.1. Determination of yeast transformation efficiency 

To obtain statistically meaningful numbers of yeast clones for the evaluation 
of phenotypic diversity (and ultimately, selection), the transformation effi- 
ciency specific to the particular host strain of S. cerevisiae intended for the 
selection process must be estimated. Many researchers typically use labora- 
tory standard strains; however, one can also use industrial or feral strains if 
selection for the plasmid can be arranged. Unfortunately, nondomesticated 
strains often transform poorly, and even laboratory strains can vary widely in
transformation efficiency, particularly if premodified extensively for specific
applications. 

As a starting point, use a high-efficiency yeast transformation protocol
(Ausubel, 2001; Gietz and Schiestl, 2007) and 1 μg of the plasmid containing
the wild-type transcription factor. Since transformation efficiencies do
not necessarily scale linearly with plasmid DNA concentrations, it is
important that this value is determined using DNA quantities similar in range to
what will eventually be used for transforming the libraries. Assuming that
approximately 2–5 OD600 units (~10 —10 cells) of mid-logarithmic phase
cells are used (per typical protocols), plate 100 μl of 1:100 and 1:500
dilutions in triplicate on agar selection plates. Incubate all plates at 30 °C
for 24–48 h until colonies are visible. Count the number of individual
clones and determine the mean number of transformants resulting from
the fraction of the total reaction plated (e.g., 100 μl from a 1:500 dilution of
a 1-ml cell resuspension is 0.02%). Scale this mean to the total size of the
transformation reaction to arrive at the number of transformants/μg of
plasmid DNA.
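
The scaling described above can be sketched in a few lines of base MATLAB. The colony counts are hypothetical and serve only to illustrate the arithmetic; the plated fraction follows the 1:500 example in the text.

```matlab
% Hypothetical colony counts from triplicate plates (not from the source).
counts = [58 62 60];

% Fraction of the reaction represented by each plate:
% 100 ul plated from a 1:500 dilution of a 1-ml (1000 ul) resuspension.
platedVolume_ul = 100;
totalVolume_ul  = 1000;
dilutionFactor  = 500;
fractionPlated  = (platedVolume_ul / totalVolume_ul) / dilutionFactor;  % 0.0002, i.e., 0.02%

dna_ug = 1;   % plasmid DNA used in the transformation (ug)

% Scale the mean count to the whole reaction, then normalize per ug of DNA.
totalTransformants = mean(counts) / fractionPlated;
efficiency = totalTransformants / dna_ug;   % transformants per ug

fprintf('Estimated efficiency: %.2g transformants/ug\n', efficiency);
```

With these made-up counts, a mean of 60 colonies on the 1:500 plates corresponds to about 3 × 10^5 transformants per μg.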

This value, specific to the host strain and protocol, is particularly important
at the selection stage for ensuring that all transcription factor variants in the
library are represented. For example, given an efficiency of 10 yeast
transformants/μg of plasmid and a library size of 1 mutants, it would be wise to use at
least 30 μg of plasmid DNA (e.g., distributed across 15 reactions using 2 μg
each) to cover 95% of the library with confidence (Reetz et al., 2008).
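
The link between oversampling and coverage can be made explicit with a commonly used approximation. It assumes that every variant is equally likely to be drawn and that the library is large, which is an idealization since real libraries are biased. With V variants and T independent transformants, the expected fraction P of variants sampled at least once is:

```latex
P = 1 - \left(1 - \frac{1}{V}\right)^{T} \approx 1 - e^{-T/V},
\qquad
T = 3V \;\Rightarrow\; P \approx 1 - e^{-3} \approx 0.95
```

Under these assumptions, a threefold excess of transformants over library size corresponds to roughly 95% expected coverage.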

4.2. Evaluation of mutant libraries 

The following section outlines the procedure of assessing phenotypic diver- 
sity using growth rate as the per-clone variable, and specifically, the quanti- 
fication of colony areas from agar plates as a proxy. From the transformation 
efficiency determined previously, individually transform each library into 
the host strain of S. cerevisiae using an amount of plasmid DNA sufficient to
generate a minimum of ~10 —10 clones. For the purpose of assessing 
phenotypic diversity, it is acceptable to dramatically undersample a library 
because one is interested only in population parameters that can be esti- 
mated from smaller, random samples. Include two controls: a sample with 
untransformed host cells and one transformed with the plasmid containing 
the wild-type transcription factor. In triplicate, plate fractions of each
reaction on solid selection medium appropriate to yield approximately
100–200 colonies per plate. For the untransformed host cell control, plate
a dilution to yield roughly the same number of colonies on an equivalent
agar plate without selection. Incubate all plates at 30 °C for 24–48 h until
individual colonies are visible. It is imperative, however, that colonies are
not allowed to overgrow. Otherwise, the variation in colony areas will be
minimized and the utility of this assessment will be diminished. If, for
unexpected reasons, the number of colonies is too high such that the density
precludes accurate area quantification, or too low such that the analysis will
lack statistical power, repeat the transformations using a corrected amount
of plasmid DNA for improved data.

Photograph all agar plates immediately using a digital imaging system 
(e.g., Alpha Innotech, San Leandro, CA), or place at 4 °C to prevent 
significant further growth until ready. Perform computational segmentation 
on each image to distinguish individual colonies. Numerous image proces- 
sing options are available including such packages as MetaMorph (Molecular 
Devices, Sunnyvale, CA) or the Image Processing Toolbox for MATLAB 
(The Mathworks, Natick, MA). 
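
As one possible illustration of the segmentation step, the sketch below uses `bwconncomp` and `regionprops` from the Image Processing Toolbox on a synthetic image. The image, the fixed threshold, and the variable names are assumptions made for the example; real plate photographs will generally need background correction, a tuned threshold, and filtering of touching or edge colonies.

```matlab
% Synthetic 100 x 100 "plate image" with two bright discs standing in for colonies.
[X, Y] = meshgrid(1:100, 1:100);
img = zeros(100, 100);
img((X - 30).^2 + (Y - 30).^2 <= 5^2) = 1;   % small colony, radius 5 px
img((X - 70).^2 + (Y - 60).^2 <= 8^2) = 1;   % larger colony, radius 8 px

bw = img > 0.5;                      % fixed threshold; real images need a tuned or adaptive one
cc = bwconncomp(bw);                 % label connected objects (Image Processing Toolbox)
stats = regionprops(cc, 'Area');     % pixel area of each object
areas = [stats.Area]';               % column vector of colony areas in pixels
```

The resulting `areas` vector plays the role of the per-colony areas used in the calculations that follow.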

4.3. Calculation of phenotypic diversity 

From each segmented image, extract the area in pixels of each identified 
object/colony. Numbers deriving from the same transformation or control 
sample (e.g., those belonging to the same triplicate plating) may be merged 
into a single data set. Within each set, take the positive differences of the
natural logarithms over all i ≠ j pairs of areas:

$$D = \{\, |\ln A_i - \ln A_j| \,\} \quad \forall\, i \neq j \qquad (20.2)$$

where A_i is the area of colony i and D is a set consisting of computed unitless
values termed "phenotypic distances."
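
For readers without access to toolbox functions, the set D can also be computed in base MATLAB. The following is a minimal sketch with hypothetical areas; it relies on implicit expansion (MATLAB R2016b or later) and keeps each unordered pair of colonies once.

```matlab
% Hypothetical colony areas in pixels (illustrative only).
A = [100; 200; 400];

L = log(A(:));                       % natural log of each area, as a column
n = numel(L);
absDiff = abs(L - L.');              % n x n matrix of |ln A_i - ln A_j|
D = absDiff(triu(true(n), 1));       % keep each unordered pair (i < j) once
```

For these three areas, D holds the values ln 2, ln 4, and ln 2, one per colony pair.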

If D is far from approximating a normal distribution or its shape is suffi- 
ciently jagged due to insufficient data points, bootstrap resampling methods 
may be employed to generate a better-behaved distribution of population 
parameters which encapsulates the variability contained in the parent popula- 
tion. For example, subsets of D may be randomly selected with replacement 
(bootstrap samples) and their means computed. The resulting set d of "average 
phenotypic distances" will then be normally distributed (by the central limit 
theorem). Furthermore, the population dispersion contained in D will be 
manifested in the distribution width of d given a sufficient number of subsets 
that randomly sample the outer quantiles (Klein-Marcuschamer and 
Stephanopoulos, 2008). The set d may then be used in place of D in all subsequent
calculations. However, if D appears sufficiently Gaussian or is symmetrically 
distributed (which may be the case if another clonal property is measured 
instead of colony area), it is not necessary to perform bootstrapping as the 
resulting phenotypic diversity rankings will be similar. 
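
The resampling idea can be sketched in base MATLAB with `randi`. The distances and the number of bootstrap samples below are assumptions chosen for illustration, not values recommended by the text.

```matlab
% Hypothetical phenotypic distances (illustrative only).
D = [0.10; 0.35; 0.20; 0.80; 0.05; 0.45];

nBoot = 1000;                        % number of bootstrap samples (assumed)
n = numel(D);
d = zeros(nBoot, 1);
for b = 1:nBoot
    idx = randi(n, n, 1);            % draw n indices with replacement
    d(b) = mean(D(idx));             % mean of this bootstrap sample
end
```

Each entry of `d` is one "average phenotypic distance"; its mean and variance can then stand in for those of `D`.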

A variety of indices can be used to quantify the dissimilarity/divergence
between the variability exhibited by cells transformed with a library over that
exhibited by a control population (the control can either be untransformed
cells or those transformed with the wild-type transcription factor). The
Bhattacharyya distance (BD) was chosen as the phenotypic diversity metric,
which is computed between the two distributions as:

$$BD = \frac{1}{4}\,\frac{(\mu_{lib} - \mu_{cont})^2}{\sigma^2_{lib} + \sigma^2_{cont}} + \frac{1}{2}\ln\!\left(\frac{\sigma^2_{lib} + \sigma^2_{cont}}{2\sqrt{\sigma^2_{lib}\,\sigma^2_{cont}}}\right) \qquad (20.3)$$

Here, μ_lib and μ_cont are the means of the phenotypic distances of the library and
control populations, respectively, and σ²_lib and σ²_cont are the corresponding
variances. As an illustration, the analysis above can be implemented easily
with just a few lines of code in MATLAB:

>> Dlib = pdist(log(lib));                         % Calculate all pairwise log distances from library population
>> Dlib = bootstrp(length(Dlib), @mean, Dlib);     % If bootstrapping
>> mulib = mean(Dlib); varlib = var(Dlib);         % Compute library mean and variance
>> Dcont = pdist(log(cont));                       % Calculate all pairwise log distances from control population
>> Dcont = bootstrp(length(Dcont), @mean, Dcont);  % If bootstrapping
>> mucont = mean(Dcont); varcont = var(Dcont);     % Compute control mean and variance
>> BD = 0.125*(mulib-mucont)^2*2/(varlib+varcont) + ...
        0.5*log((varlib+varcont)/2/sqrt(varlib*varcont))  % Compute Bhattacharyya distance

Here, lib and cont are column vectors that contain colony areas derived 
directly from image segmentation of the library and control platings, respec- 
tively. Calculating BDs between the same control population and each of the 
libraries with different mutation frequency thus provides the quantitative 
scores needed to rank the libraries. 
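
The ranking step can be sketched as follows, using the univariate Bhattacharyya distance in the same algebraic form as the MATLAB expression above. The library labels and all means and variances are made up for illustration.

```matlab
% Univariate Bhattacharyya distance between two normal summaries (mean, variance).
bd = @(m1, v1, m2, v2) 0.25 * (m1 - m2).^2 ./ (v1 + v2) ...
    + 0.5 * log((v1 + v2) ./ (2 * sqrt(v1 .* v2)));

% Hypothetical summaries of phenotypic distances (illustrative only).
muCont = 0.20;  varCont = 0.010;                 % control population
names  = {'lowMut', 'medMut', 'highMut'};        % made-up library labels
muLib  = [0.22, 0.35, 0.28];
varLib = [0.012, 0.040, 0.015];

scores = bd(muLib, varLib, muCont, varCont);     % one BD per library
[sortedScores, order] = sort(scores, 'descend'); % highest diversity first
ranked = names(order);
```

Because every score is computed against the same control, the sorted list is directly comparable across libraries. A library identical to the control in both mean and variance would score zero.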

Equation (20.3) is a simplified, univariate form of the BD. A more
general, matrix form also exists that can be used to calculate BD when
data are collected for each library and the control under multiple conditions
(e.g., synthetic media, synthetic media + 5% ethanol):

$$BD = \frac{1}{8}\,(\mu_{lib} - \mu_{cont})\left[\frac{\Sigma_{lib} + \Sigma_{cont}}{2}\right]^{-1}(\mu_{lib} - \mu_{cont})^{T} + \frac{1}{2}\ln\!\left(\frac{\det\!\left(\frac{\Sigma_{lib} + \Sigma_{cont}}{2}\right)}{\sqrt{\det\Sigma_{lib}\,\det\Sigma_{cont}}}\right) \qquad (20.4)$$

Here, μ_lib and μ_cont are row vectors that contain the means of the phenotypic
distances in condition 1, condition 2, etc. for the library and control
populations, respectively. The corresponding covariance matrices Σ_lib and Σ_cont are
derived from matrices of phenotypic distances where column 1 refers to
condition 1, column 2 to condition 2, etc. A MATLAB implementation of a
multivariate scenario can also be fairly straightforward (illustrated here with
two conditions and bootstrapping omitted):

>> Dlib = [pdist(log(lib1))' pdist(log(lib2))'];
>> mulib = mean(Dlib); covlib = cov(Dlib);
>> Dcont = [pdist(log(cont1))' pdist(log(cont2))'];
>> mucont = mean(Dcont); covcont = cov(Dcont);
>> BD = 0.125*(mulib-mucont)*inv((covlib+covcont)/2)*(mulib-mucont)' + ...
        0.5*log(det((covlib+covcont)/2)/sqrt(det(covlib)*det(covcont)))

Here, lib1 and lib2 are column vectors with the same number of rows
containing data derived from platings of a library under two different
conditions, and cont1 and cont2 are column vectors with their own matching number
of rows containing data from platings of a control under the same two
conditions. Of course, in lieu of Eq. (20.4), it is also appropriate to compute Eq. (20.3)
independently for each condition; it is likely, then, that the scores between 
conditions will covary. However, if a condition is identified that breaks the 
correlation in rankings, it may be indicative of a fundamentally different 
transcriptional program mediated by the mutated transcription factor and 
warrant further investigation once specific mutants are isolated. 




5. Selecting for Phenotypes of Interest 

Since one generally gets what one selects for in a mutant search, it is 
important to ponder the selection strategy to consider potential outcomes. In 
our experience, enhanced tolerance is typically more straightforward to 
achieve as the selection pressure (i.e., survival) is directly coupled to the 
resulting phenotype. In contrast, enhanced production, particularly of toxic 
products, requires the abilities to both withstand and continue output in 
increasingly harsh conditions and, therefore, may be more difficult to attain. 
For example, yeast isolated after prolonged exposure to a combination stress 
such as elevated ethanol and glucose are, by definition, more resistant than the 
parental strain. However, they may survive by upregulating a variety of 
membrane remodeling and osmoadaptive mechanisms that allow them to 
continue metabolizing and growing, or they may enter a quiescent phase that 
allows them to fortify and lie dormant until more favorable conditions are 
detected (Gray et al., 2004). The former scenario would likely allow the
continued output of fermentative products, such as ethanol, while the latter
would not. As such, the following steps describe a selection primarily for
gaining improved tolerance (using elevated ethanol and glucose as an example),
and discussion is offered on how improved production phenotypes may also be
achieved.

5.1. Creation and maintenance of yeast library 

Choose the one or two libraries with the highest phenotypic diversity scores, 
and transform the host yeast strain with an amount of plasmid DNA 
corresponding to at least 3× library coverage. To recover the maximal number
of variants, plate the entire reaction on a series of large-format dishes
(~50 × 150 mm (D) × 10 mm (H), or ~15 × 245 mm (L) × 245 mm
(W) × 18 mm (H)) containing solid selection medium. Unlike typical
transformations requiring comfortable margins between colonies, higher densities
are acceptable here as the entire population of plasmid-containing cells will 
ultimately be pooled. Allow growth at 30 °C for a duration sufficient to 
minimize the difference in colony size between slower and faster growing clones 
(~96 h). Incidentally, the rationale for performing plasmid selection on agar 
plates instead of in (more convenient) liquid medium arises from the pheno- 
typic diversity analysis and the potentially large range of growth rates known to 
result from the expression of transcription factor variants. Clonal expansion in 
liquid medium would generate widely different proportions of each mutant 
and, thus, result in disproportionate representation in the selection process. 
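
To illustrate the skew with purely hypothetical numbers, consider two clones that start at equal abundance and grow exponentially in liquid medium with doubling times τ_1 = 2 h and τ_2 = 3 h. After t hours of unrestricted growth, the ratio of their abundances is:

```latex
\frac{N_1(t)}{N_2(t)} = 2^{\,t\left(\frac{1}{\tau_1} - \frac{1}{\tau_2}\right)},
\qquad
t = 24\ \mathrm{h}:\quad 2^{\,24\left(\frac{1}{2} - \frac{1}{3}\right)} = 2^{4} = 16
```

In this idealized case, a single day of liquid growth leaves the faster clone 16-fold overrepresented, whereas on agar each clone remains confined to one colony regardless of its growth rate.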

Mechanically harvest all colonies off the entire series of plates and pool into 
liquid selection medium. If necessary, concentrate the cells by centrifugation, 
and resuspend to a cell density encapsulating the total number of variants by at 
least threefold. For example, again assuming a library size of ~10 variants, 
resuspend cells thoroughly to an OD600 ≈ 1 (~10 cells/ml). Prepare 15%
glycerol stocks containing at least 1 OD600 unit of cells per cryogenic vial, and
freeze immediately at −80 °C (Ausubel, 2001). Cells may be revived by
thawing, inoculating into liquid plasmid-selection medium, and growing
briefly at 30 °C (4–6 h maximum) to allow for cell recovery but minimal
replication. Cultures are then ready for treatment appropriate to the phenotype 
of interest. These glycerol stocks should not be used as inocula for subsequent 
propagation of the yeast library. Regenerating the full population of yeast 
variants should be done by retransforming library plasmid DNA into fresh 
host cells and expanding the clones on solid selection medium. 

5.2. Basic selection on liquid versus solid media 

To perform a basic selection in liquid medium, prepare a 25–30 ml culture
containing the plasmid-selection condition and the condition of interest (e.g.,
YSC −URA + 100 g/l glucose + 5% ethanol). Inoculate the yeast library to an
OD600 ~0.05 and incubate for several days to a week at 30 °C. Follow growth
by OD600 and subculture into fresh selection medium as necessary. To recover
surviving mutants, streak for discrete colonies on solid medium containing the
plasmid-selection condition but without the condition of interest (e.g.,
YSC −URA).

As an alternative to liquid culture, selection may also be performed on 
solid medium if the condition of interest is amenable to a 2% agar mixture. 
In this scenario, spread the entire yeast library across several plates and 
depending on the density of colonies that form, restreak or velvet-copy 
onto fresh selection plates as necessary. Allow growth at 30 °C for a total of 
several days to a week. If the condition of interest contains a volatile 
ingredient such as ethanol, placing a liquid drop (~100 μl) of the pure
component faceup in the plate can compensate for evaporation over the 
course of selection. Although liquid cultures ensure greater homogeneity of 
conditions, agar plates have the benefit of allowing for observation of 
individual clonal behavior such as growth rates. 

From yeast isolates emerging from the selection process, randomly select 
10–20 individual clones for further analysis. An important first step is to
quantify the actual improvement in behavior imparted by the mutated 
alleles over the wild-type transcription factor. For a phenotype like 
enhanced tolerance, this involves characterizing individual growth rates 
and/or maximum cell densities in the condition of interest. First, generate 
two reference strains by individually transforming an empty plasmid and the 
plasmid containing the wild-type transcription factor into the host yeast 
strain and selecting for individual transformants. Next, create liquid starter 
cultures of all strains by inoculating from single colonies and growing cells 
until saturation. Prepare fresh cultures with medium containing the condi- 
tion of interest, inoculate to a common density from the starter cultures, and 
follow the increase in biomass by OD 600 until stationary phase. Parameters 
extracted from the growth curves will thus allow ranking of the mutants 
showing the most improved performance over the wild type. In addition, 
the data should reveal if the extra copy of the wild-type transcription factor 
is sufficient to influence the phenotype of the host yeast strain under the 
condition of interest. 
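
One simple way to extract a growth-rate parameter from such curves is a linear fit of ln(OD600) against time over the exponential phase. The sketch below uses synthetic data generated with a known rate; the sampling interval, the starting density, and the choice of which points belong to the exponential phase are assumptions that must be set for real data.

```matlab
% Synthetic exponential-phase data (illustrative only): OD600 sampled hourly.
t  = 0:1:8;                          % time (h)
od = 0.05 * exp(0.30 * t);           % OD600 generated with a known rate of 0.30 per h

% Fit ln(OD600) = ln(OD0) + mu * t by least squares over the exponential phase.
p  = polyfit(t, log(od), 1);
mu = p(1);                           % specific growth rate (1/h)
doublingTime = log(2) / mu;          % doubling time (h)
```

Applying the same fit to each mutant and to both reference strains gives growth rates that can be compared directly; the maximum cell density can be read from the plateau of each curve.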

5.3. Alternative selection strategies and 
postselection screening 

Readers are encouraged to consider variations on the basic selection to 
increase the probability of isolating higher performing mutants. For exam- 
ple, instead of conditions that remain static, one scheme is to gradually 
increase the strength of the perturbation over the course of selection. 
Certain response mechanisms may operate by remodeling cellular structures 
or processes and require longer adaptation periods; thus, cells may be
able to withstand smaller, incremental stresses instead of a single large shock.
For example, glucose- and ethanol-tolerant yeast strains were isolated
through a selection starting at 5% ethanol and 100 g/l glucose and ending
at 6% ethanol and 120 g/l glucose through multiple subcultures (Alper
et al., 2006).

Particularly for phenotypes involving combined tolerance to multiple, 
distinct conditions, alternative adaptation sequences — instead of a single 
selection that presents all stresses simultaneously — may also yield higher 
performing mutants. In an effort to generate combined tolerance to ethanol 
and SDS in E. coli using variants of rpoD, various search trajectories were 
directly investigated to identify the selection path giving rise to the most 
doubly resistant strains (Alper and Stephanopoulos, 2007). The following 
strategies were compared: (a) a doubly tolerant strain was directly selected 
in ethanol + SDS (reference scenario); (b) an ethanol-tolerant strain was 
first isolated, its specific allele used as a template for a new round of 
mutagenesis, and a second selection was done in ethanol + SDS; (c) an 
SDS-tolerant strain was first isolated, its specific allele used as a template for a 
new round of mutagenesis, and a second selection was done in ethanol + 
SDS; and (d) ethanol- and SDS-tolerant strains were isolated individually in 
parallel, and their respective alleles subsequently isolated and coexpressed. 
Surprisingly, it was observed that scenario (d) resulted in the greatest 
improvement, and that coexpression conferred the two full phenotypes 
independently of one another. To our knowledge, such a parallel adaptation 
and combination strategy has yet to be attempted in S. cerevisiae, and it is
formally possible that the pure additivity of traits was a consequence of 
functional decoupling specific to the two rpoD alleles isolated (one full- 
length, one truncated). Nonetheless, these results from E. coli serve as a 
successful demonstration that variations on a selection strategy can have 
considerable influence on the resulting behaviors. 

While enhanced tolerance to a condition of interest is a direct outcome 
of the environmental pressure applied during selection, improved produc- 
tion phenotypes typically are not. For example, it is generally believed that 
yeast cells able to endure prolonged exposure to elevated ethanol do not 
necessarily experience a concurrent demand to increase ethanol output (as it
would contribute little to survivability). There may even be a small pressure
to inhibit production if continued output begins to increase ethanol
concentrations. Likewise, any hypothetical selection scheme that is able to
couple higher production to survival would likely culminate in respiration- 
enhanced strains that can easily transition to ethanol metabolism — scenarios 
that would all minimize the accumulation of ethanol. 

In general, it is thought that there is neither strong selection for 
nor against ethanol production during prolonged exposure to elevated 
concentrations. Assuming that ethanol remains a neutral by-product of 
glucose metabolism, yields may be loosely associated with growth rates.
Strains endowed with improved resistance may thus harbor a potential for
producing more ethanol as higher levels are, presumably, less disruptive to
their normal metabolic processes. Therefore, it is suggested that enhanced
production phenotypes be screened for among the tolerance-improved
isolates. For example, enzymatic assay kits are available for ethanol
(R-Biopharm, Darmstadt, Germany) that enable feasible comparison of dozens
of candidates. Mutants with improved tolerance can be individually
fermented in small cultures containing the condition of interest, and specific
ethanol titers from the media easily quantified in vitro. If commercial
solutions do not exist for the product of interest, it may be possible to develop
visual or dye-based approaches that offer both specific detection and
reasonable throughput (Santos and Stephanopoulos, 2008; Yu et al., 2008).




6. Validation 

Most of the strains emerging from selection will have achieved their 
unique properties through an altered transcriptome mediated by a mutated 
transcription factor. However, it is possible that some phenotypes were 
partially underwritten by mutations in the genome incurred during the 
course of adaptation, a possibility that increases if the selection was particu- 
larly harsh. Additionally, transformation itself can be mutagenic. Dissecting 
the source of the enhanced behavior requires isolation of the plasmids 
carried by these strains, and testing for transferability of the phenotype to 
fresh hosts of the parent genetic background. 

To rescue the plasmids containing the mutated transcription factor from 
the improved strains, perform a basic genomic DNA preparation (e.g., 
"smash and grab" protocol) and directly transform competent E. coli cells 
(Ausubel, 2001). Alternatively, timesaving commercial kits are available that 
yield relatively pure plasmid DNA straight from yeast cultures (e.g.,
Zymoprep II; Zymo Research, Orange, CA). Retransform these constructs
into fresh cultures of the host yeast strain, and determine from growth rate 
or product yield assays if the phenotype of interest can be recapitulated. For 
constructs showing partial or full transferability of phenotype, repurify the 
plasmids and sequence the inserts to identify the responsible sets of 
mutations. 

In the event that the observed phenotype is different from that of 
the selected strain, it is likely that chromosomal mutations contributed 
to the improved behavior either in conjunction with, or independently 
of, the plasmid-borne transcription factor variant. Fortunately, identifying 
these potentially obscure changes in the vastness of the host genome 
is becoming increasingly tractable with current and next-generation 
DNA profiling technologies. For example, microarray-based comparative
genomic hybridization can detect rearrangements (e.g., deletions and
amplifications) at the single-gene level (Dunham et al., 2002; Watanabe
et al., 2004) while whole genome deep sequencing can resolve alterations at
the single-nucleotide level (Bentley, 2006; Liti et al., 2009). By comparing
the evolved strain with the parental strain, these approaches promise insight 
into how complex, nonphysiological cellular behaviors can be synthesized 
with potentially minimal genetic changes. 




7. Concluding Remarks 

Although we have not tested in vivo cloning methods in our labora- 
tory extensively, we wish to apprise readers of the possibility of creating 
gTME yeast libraries directly from mutagenized linear DNA fragments. 
Given the ease of homologous recombination in S. cerevisiae, PCR pro- 
ducts containing 50 bp of sequence complementary to the 5' and 3' ends of 
a cotransformed, linearized vector can effectively be "ligated" in vivo by the 
endogenous gap repair machinery. The time savings can be substantial as 
the steps involving whole plasmid amplification, library preparation in E. 
coli, and plasmid transformation into yeast are all bypassed. Furthermore, 
efficiencies of approximately 10—10 transformants/μg of insert DNA and
yeast library sizes of ~10 have been reported (Swers et al., 2004).
Although one would likely need to return to PCR mutagenesis to regen- 
erate a yeast library containing transcription factor variants of the same 
mutation frequency (e.g., in another host background), the benefits of this 
approach are many and should warrant consideration. 

With one or a set of phenotype-enhancing transcription factor variants 
isolated and characterized, numerous avenues are available postselection 
for gaining potentially further improvements (Neylon, 2004; Wong et al.,
2007). The sampling of options mentioned here are all strategies typical in 
the practice of directed molecular evolution — strategies designed to extend 
the adaptation trajectory to superior properties by finer grain sampling of 
the sequence space in the region of the previous enhancement. 

Three example subsequent steps are provided here that can further 
enhance the desired phenotype. First, gTME can be applied in a directed 
evolution manner where beneficial transcription factor variants are used as 
templates in multiple iterations of mutant library creation and strain 
selection. Indeed, work in E. coli has shown that additional mutagenesis 
and selection cycles result in fine-tuning of the altered transcriptome: the 
enhanced phenotype is either maintained or improved while the number 
of genes perturbed actually decreases (Alper and Stephanopoulos, 2007). 
Second, a collection of phenotype-enhancing variants can indicate sites 
within the protein enriched for mutations and thus favorable for further
mutagenesis. Targeted saturation mutagenesis of key codons can quickly
reveal combinations (including nonconservative substitutions) offering 
significant phenotypic increases that would otherwise have been inacces- 
sible by random point mutagenesis (Miyazaki and Arnold, 1999; Reetz 
et al., 2008). Third, a set of positive transcription factor variants can be 
subjected to in vitro DNA recombination techniques that allow for the 
coupling of beneficial mutations and the simultaneous elimination of 
deleterious or neutral substitutions. In a method such as the staggered 
extension process (StEP), the coding sequences of both the variants and 
wild-type serve as a mixed pool of templates in a thermal cycling program 
modified with very abbreviated primer extension phases. By randomly 
annealing incompletely extended fragments to different templates in each 
cycle, a collection of chimeric full-length sequences is generated that, 
when subjected to selection, can reveal optimized sets of mutations that 
display significantly improved phenotypes (Zhao, 2004; Zhao et al., 1998).

This overview has described gTME as a mutagenesis and selection
technique for generating industrially relevant phenotypes in S. cerevisiae.
The capacity to elicit enhanced behaviors is increasingly valuable as the
budding yeast is engineered further as a production platform for
non-native or ever more toxic molecules.

Abstract

We describe here optimized protocols for tagging genomic DNA sequences with 
bacterial operator sites to enable visualization of specific loci in living budding 
yeast cells. Quantitative methods for the analysis of locus position relative to 
the nuclear center or nuclear pores, the analysis of chromatin dynamics and the 
relative position of tagged loci to other nuclear landmarks are described.
Methods for accurate immunolocalization of nuclear proteins without loss of
three-dimensional structure, in combination with fluorescence in situ hybridiza- 
tion, are also presented. These methods allow a robust analysis of subnuclear 
organization of both proteins and DNA in intact yeast cells. 




1. Introduction 

Quantitative imaging techniques have improved dramatically in the last 
15 years, reflecting both the rapid adaptation of naturally fluorescent proteins 
to cellular applications and improvements in fluorescence microscopy itself. 
Methods are also being continually optimized for the analysis and localization 
of endogenous proteins and chromosomal loci in living yeast cells. This 
involves novel microscope systems as well as improved computational tools 
for image analysis. Crucial to this process are tools for the rapid processing of 
the high-resolution digital-image stacks, since megabytes of data are produced 
in a single 3D time-lapse experiment on either a deconvolution widefield 
microscope or spinning disk (SD) confocal instrument (Horn et al., 2007).

While techniques of live microscopy are powerful, it is not trivial to 
perform them correctly. Specifically, accurate visualization of more than two 
fluorophores at the same time can be difficult, and care must be taken to avoid 
damage by the light that is used for imaging. This can be particularly problem- 
atic when dealing with mutants that enhance sensitivity to damage or stress. 
Maintenance of unperturbed growth conditions and minimization of exposure 
time and light intensity are essential for meaningful results. Because high- 
resolution time-lapse microscopy often captures only one or a few cells per 
3D stack, the imaging step can itself take considerable time, rendering it 
difficult to obtain sufficient numbers of cells or to carry out large time-course 
experiments. If several strains are to be analyzed in parallel, it is recommended 
that cells be fixed by formaldehyde at the desired time points, so that the 
localization of proteins or DNA can be achieved later by immunofluorescence 
(IF) and/or fluorescence in situ hybridization (FISH). 

This chapter contains two sets of optimized protocols for the visualization 
of specific proteins and/or DNA sequences in budding yeast. The first set 
describes the targeting and analysis of proteins fused to the fluorescent protein 
GFP or its derivatives. The second section describes more classical methods 
for IF and/or FISH, which are sometimes the methods of choice for visualiz- 
ing different types of macromolecules at once. Basic methods for quantitative 
analysis of subnuclear position and chromatin dynamics are described. These 
methods have been optimized for the localization of one or several targets in 
the nucleus relative to DNA or the nuclear envelope (NE). We note that 
improvements are continually being made in these procedures and that future 
users should seek updates on the methodology in the literature. 
2. Strain Constructions and Image Acquisition 
for Nuclear Architecture Analysis in Living 
Cells 

2.1. Tagging chromatin in vivo with lac and tet 
operator arrays 

The study of chromatin organization in live budding yeast cells often 
exploits the recognition of integrated arrays by fluorescently labeled 
bacterial DNA-binding factors, usually the LacI or TetR repressor (reviewed in 
Belmont, 2001; Hediger et al., 2004; Neumann et al., 2006). The target 
arrays consist of anywhere from 100 to 256 copies of the recognition 
consensus (lacO or tetO). As few as 24 binding sites are usually sufficient to 
allow the formation of a visible spot, although the signal-to-noise ratio 
depends on the expression level of the fluorescently tagged binding protein. 

Tagging chromatin in vivo is a two-step process. The first step involves the 
expression of a fusion between a DNA-binding protein, a fluorescent protein, 
and a nuclear localization signal. Both integrative and episomal plasmids have 
been used to express these proteins (Michaelis et al., 1997; Straight et al., 1996). 
Integrative plasmids give more reproducible levels of the fluorescently tagged 
proteins. The DNA-binding Lac repressor is available as a fusion with green 
fluorescent protein (GFP) or its cyan and yellow variants (CFP, YFP), and 
the Tet repressor exists as fusion proteins with GFP, CFP, YFP, and the 
monomeric variant of the red fluorescent protein, mRFP (Lisby et al., 2003). 
To increase the fluorescence signal, a fluorescent protein can be introduced as a 
tandem array (3×CFP; Bressan et al., 2004). Expression levels of these proteins 
have to be kept low, as overexpression elevates the background fluorescence, 
enhances nonspecific binding, and can cause slow growth. 

The binding site arrays recognized by the fluorescently labeled repressors 
are repetitive and unstable by nature in both bacteria and yeast. To avoid 
recombination and loss of copy number, the bacteria (either DH5α or 
recombination-deficient strains like SURE (Stratagene)) should be grown 
at 25 or 30 °C. When thawing bacterial strains, several colonies have to be 
tested for the size of the array by digestion of plasmid preparations with 
enzymes encompassing the array. The binding sites are inserted as an array in 
strains expressing the fluorescent DNA-binding proteins. For unknown 
reasons, expression of the DNA-binding protein in yeast stabilizes the 
array; therefore, it is recommended to transform yeast with the fusion 
protein construct prior to introducing the lacO or tetO sites. 

To date, three techniques have been used to insert arrays at specific loci 
in the yeast genome. The first technique is based on the cloning of a small 
PCR-generated fragment of genomic DNA (about 400-800 bp) into the 
array-containing plasmid (Fig. 21.1A; Heun et al., 2001a,b). This fragment 
is chosen so that it contains a unique restriction site that is not present in 
the lacO/tetO plasmid. Once the fragment is cloned into the array plasmid, 
digestion with this single-cutter enzyme linearizes the plasmid, which can 
then be used for homologous recombination. The homology created by the 
small genome segment targets the plasmid to the desired genomic locus. It 
also creates direct repeats flanking the array, which might be detrimental to 
the stability of the array, as these allow popping-out of the whole plasmid by 
recombination between the two direct repeats. Positive transformants are 
selected by resistance to a selective marker present on the plasmid; correct 
insertion is then tested by PCR and/or Southern blotting. During the 
transformation process, some binding site repeats may be lost; therefore, 
transformed yeast colonies have to be screened microscopically for the 
presence of a bright spot. One should not store the resulting yeast strains at 
room temperature for more than a week, and positive clones should be frozen 
immediately. Spot presence has to be reconfirmed after thawing. 

Figure 21.1 Outline of the methods for site-specific integration of lacO/tetO repeats in the genome cloning-free chromatin tagging. For details 
see text. 

The second technique was developed to avoid tedious cloning steps with 
large plasmids containing lacO/tetO repeats (Rohner et al., 2008). It is a 
two-step process involving first PCR-based integration of a marker flanked 
by 100-bp tags at the locus of interest. Once the tags are integrated into the 
genome at the locus of interest, they can be used for homologous 
recombination to integrate lacO/lexA repeats and a second selectable marker 
(Fig. 21.1B). To this end, the tags are cloned into the lacO/tetO repeat 
plasmid in reverse orientation with a rare cutting site in between them. 
When the plasmid is cut with this enzyme, the two adaptamers encompass the 
lacO/tetO repeats and can therefore be aligned with the tags flanking the 
marker in the genome. This technique is more flexible in terms of markers and 
allows one to tag the same locus with different binding sites without the need 
to reclone a PCR fragment into an array-containing plasmid. 

A third technique combines the previous two and has been developed to 
avoid integrating a marker gene next to the repeats (Fig. 21.1C; Kitamura 
et al., 2006). In a first step, a URA3 gene is inserted at the locus of interest 
using long primer PCR-mediated recombination. To achieve replacement 
of the URA3 gene, a fragment of about 700 bp corresponding to the URA3 
insertion site is cloned in the lacO/tetO repeat plasmid. As for the first 
technique described above, the recipient plasmid contains a single cut site 
in the middle of the cloned fragment. Transformation of the cut plasmid 
leads to replacement of URA3 by the repeats. In this case, positive colonies 
are selected by their ability to grow on 5-fluoro-orotic acid (FOA), which is 
toxic in the presence of URA3. The main drawback of this technique is that 
it does not allow selection of the colonies which still contain the array, for 
example, after freezing. Direct replacement of URA3 using an adaptamer- 
based technique with a marker-free plasmid is impossible, as FOA-resistant 
colonies arise more frequently than recombination events. 
It is often useful to insert a low number of binding sites for another 
DNA-binding protein next to the lacO or tetO sites integrated at specific loci. 
This allows one to target another protein to the site of interest, which can be 
used to manipulate the locus. For example, integrated lexA sites allow binding 
of lexA fusion proteins, such as a lexA-Yif1 fusion, that anchors the tagged 
chromatin locus to the NE (Taddei et al., 2004). Plasmids for the tagging 
methods described above are available with lexA binding sites located next to 
the lacO/tetO repeats (Rohner et al., 2008; Taddei et al., 2004). Other 
locus-tagging systems in development include a lambda repressor/operator 
system (K. Bystricky, A. Taddei, personal communication). 

2.2. Determining the position of the nucleus 

For precise localization studies, as well as for studying chromatin dynamics, 
the nuclear volume has to be defined. This can be achieved either by 
expression of a nucleoporin fused to a fluorescent protein (commonly 
Nup49-GFP) or by using the nuclear background fluorescence created by 
the unbound TetR protein. LacI-GFP tends to give very little background 
even in the absence of a lacO array, probably due to its low expression level. 

2.3. Immobilizing cells for microscopy 

To obtain images which allow the reliable measurement of chromatin 
position and dynamics there are two central concerns. First, one must 
immobilize the yeast cells and second one must prevent distortion of cell 
shape by pressure from the coverslip or objective. Both are achieved by the 
following methods. 

For "snapshot" exposures where yeast cells will be imaged only once, living 
cells are mounted on a pad of agarose in synthetic medium. Immobilizing 
cells between agarose and a coverslip does not flatten or distort cells, while 
coverslip pressure on a glass slide does. Optimal agarose patches are created 
on depression slides, which have a concave depression in which the agarose 
and cells are placed. The agarose (1.4%) is dissolved in an appropriate 
medium (YPD gives more background than synthetic SD medium), and if 
imaging or cell maintenance lasts more than 20-30 min, it is recommended to 
use higher than usual levels of glucose (4% instead of 2%). Glucose can be 
locally depleted by cells in the agarose pad while they are being imaged, and 
this reduces chromatin mobility within nuclei (Heun et al., 2001b). Agarose 
prepared with yeast medium can be distributed in aliquots and kept for 
months at room temperature. 

1. Prior to use, agarose is dissolved in growth media at 95 °C for several 
minutes. The agarose should be liquid, but prolonged maintenance at 
high temperature increases background fluorescence. 



Figure 21.2 Means to immobilize yeast cells for imaging. (A) Formation of a flat- 
topped pad of agarose dissolved in media on a depression slide. (B) Cell observation 
chamber (Ludin chamber, Life Imaging Services) with cells immobilized on the lectin- 
coated bottom glass coverslide is shown. 

2. Melted agarose is then poured into the depression of the slide. 

3. A normal slide is immediately placed across the top to remove excess 
agarose and create a flat surface on the pad (Fig. 21.2A). While the 
agarose solidifies, 1 ml of an exponentially growing culture (at 
concentrations <0.5 X 10 cells/ml) is spun in a microcentrifuge and 
resuspended in 20 µl of appropriate medium. Cells can be grown in synthetic 
medium or YPD, but YPD cultures show more autofluorescence. Note 
that high cell density or glucose depletion alter chromatin dynamics 
(Heun et al., 2001b). 

4. After removal of the upper slide by sliding along the depression slide 
surface, 5 µl of the concentrated cells are placed on the agarose, and the 
pad is covered by a fresh coverslip. Capillary forces are generally strong 
enough to hold the coverslip in place. One should avoid fixing the 
coverslip with nail polish as some brands of nail polish contain solvents 
that inhibit yeast growth. 

For live imaging over longer periods of time, cells can be noncovalently 
immobilized on a coverslip coated with lectin and visualized in media 
in an observation chamber (Ludin chamber, Life Imaging Services, 
Fig. 21.2B) as described below. 

1. For budding yeast, Concanavalin A (Sigma) is used at 1 mg/ml, while for 
fission yeast a lectin from Neisseria gonorrhoeae (Sigma, 1 mg/ml) is 
optimal. Coverslips (18 mm Ø) are covered with 100 µl lectin solution 
which is immediately removed (the solution can be reused and kept at 
-20 °C). 

2. Coverslips are left to dry at room temperature (>20 min) and can be 
kept for months protected from dust. 
3. These coverslips are used in an observation chamber (Ludin chamber) 
that allows cells to be immersed in media that can be exchanged by 
continuous flow or at defined intervals (Life Imaging Services, 
Fig. 21.2B). 

4. Cells are sedimented on the coverslip before removal of excess media. 
One milliliter of fresh preheated medium is then added to the cells 
before sealing of the chamber. If needed, a flow of medium can be 
used, although very slow rates (flow < 1 ml/min) should be used as 
pressure changes induced by liquid pumping can cause movement of the 
coverslips or cells in the chamber. 



2.4. Controlling temperature 

Stable conditions for microscopy are best achieved by temperature- 
controlled rooms (±1 °C). The microscope stage itself can then be heated 
to the desired temperature (30 °C for wild-type strains) using a Plexiglas box 
that encloses the entire microscope stage (many providers now offer this 
option adapted to the specific instrument). Another method only heats the 
stage, but temperature control is less precise as an unheated objective can act 
as a heat sink and cool the sample during observation. Heated objectives are 
also available. 



2.5. Image acquisition set-ups 

The appropriate choice of microscope depends on the aim of the experi- 
ment. Whatever system is used, it is essential to check first that the cells 
survive the high-intensity light used for fluorescence illumination without 
damage or cell cycle arrest. The more subtle the monitored phenomenon is, 
the more extensive the controls must be for light-induced changes in cell 
physiology. The simplest assay is to compare the kinetics of cell cycle 
progression in cells subjected to the experimental pattern of illumination 
with nonimaged cells. Various time intervals, intensities of light, wave- 
lengths and/or gray filters should be tested; unbudded cells should rebud 
within 120 min at room temperature after imaging on YPD. 

Every microscopic system is a compromise between speed of acquisition 
(the higher the speed, the lower the amount of light that can be recorded), 
the field of acquisition (in general, the bigger the field, the slower the 
acquisition), and resolution (higher resolution decreases speed and signal, 
since each pixel on the image corresponds to a smaller part of the sample and 
more pixels take more time to acquire). Since the haploid yeast nucleus is 
only 1 µm in radius, it is recommended that the objective magnification be at 
least 63×, or ideally 100×, with a numerical aperture (NA) as high as 
possible (between 1.3 and 1.45). This allows a high-resolution camera 
to obtain maximal detail from the sample (the smallest resolvable distance is 
inversely proportional to NA). 
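The dependence of resolution on NA can be illustrated with a minimal sketch. It assumes the Rayleigh criterion (d = 0.61 λ / NA), which the chapter does not name explicitly, and an emission wavelength of 510 nm chosen here as a representative value for GFP; the function name is hypothetical.

```python
def rayleigh_resolution_nm(wavelength_nm: float, numerical_aperture: float) -> float:
    """Smallest resolvable lateral distance in nm (Rayleigh criterion, assumed here)."""
    if numerical_aperture <= 0:
        raise ValueError("numerical aperture must be positive")
    return 0.61 * wavelength_nm / numerical_aperture


# Assumed GFP emission wavelength; the NA values span the range recommended in the text.
GFP_EMISSION_NM = 510.0

for na in (1.3, 1.4, 1.45):
    d = rayleigh_resolution_nm(GFP_EMISSION_NM, na)
    print(f"NA {na:.2f}: lateral resolution of roughly {d:.0f} nm")
```

Under these assumptions the resolvable distance stays a little above 200 nm across the recommended NA range, which is small compared with the 1 µm nuclear radius but far from negligible.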

The first image acquisition setup described here is based on an improved 
widefield microscope, with a monochromator that regulates the light 
source, combined with rapid, high-precision Z motor, and a rapid and 
highly sensitive CCD camera for image capture. Since there is no pinhole, 
light from out-of-focus planes will be recorded, which can be later used by 
deconvolution algorithms that recalculate position of the emitted light based 
on an ideal or measured point spread function. The main drawback of this 
system is the phototoxicity due to whole cell illumination. 

A second, widely available system is the laser-scanning microscope (Zeiss 
LSM510/710, Leica SP5). These systems have been proven very useful for 
acquiring very fast time-lapse recordings. Their limitation is the scanning 
speed, which is only fast enough for live imaging of chromatin dynamics if 
the field of scanning (region of interest or ROI) is reduced to a minimum 
(e.g., one cell). These confocals allow manual minimization of the beam 
intensity and pinhole. Again, there is a compromise between laser power 
(which increases phototoxicity, but allows more rapid image capture) and 
scanning speed (essential for the identification of rapid movements observed 
for chromatin in vivo). 

A third system that we strongly recommend is based on a rapid, wide- 
field high precision microscope, although the light source is a laser whose 
beam is focused on a rotating disk with thousands of pinholes. This disk 
spins at high speed dispersing the laser beam such that the whole laser power 
is never focused on a single point in the sample. This reduces phototoxicity 
and bleaching of the fluorochrome; moreover, the speed of capture is faster 
than that of a scanning laser system. Out-of-focus light is filtered through 
the pinholes, and entire fields of cells can be captured at once. In the 
following sections, we discuss the critical points of each of these setups. 

2.5.1. Rapid high-precision widefield microscopy 

For the imaging of a large number of cells at a single time point, best results 
are obtained with a high-precision widefield microscope. These micro- 
scopes are equipped with a piezoelectric focus either with the objective 
mounted on it directly (e.g., PiFoc, Physik Instrumente) or a piezoelectric 
table (e.g., ASI MS2000, Prior), which allows one to capture stacks of focal 
planes. The Z distance between planes is carefully controlled and highly 
reproducible, and movement from one plane to the next is nearly instantaneous. 
The light source is very important, as classical mercury bulbs cause 
phototoxicity. The light source of choice for maximum versatility is a 
monochromator (Xenon light source coupled with Polychrome, TillVision), 
which allows excitation wavelength choice in nanometer steps (320-680 nm 
continuous spectrum, 20 nm window). Switching wavelengths is rapid 
(<1 ms). A cheaper though less flexible alternative is LED-based 
illumination (CoolLED, precisExcite), where up to four fixed wavelengths 
can be chosen at the time of order. LEDs are very long lived (3 years 
guaranteed by the supplier), which makes them a cost-effective solution. 
Switching time is even faster than with the monochromator (around 300 µs). 
From a performance point of view, we found no significant differences between 
a monochromator and an LED-based illumination system. 

Acquisition is achieved with a high-resolution CCD camera. To detect 
subnuclear or subcellular details, one needs a final pixel size between 60 and 
80 nm with a 100× objective. The readout of the camera by the computer is 
often the rate-limiting factor of the system. Typically, high-speed CCD 
cameras (Roper Scientific Coolsnap HQ, Andor IKon, Hamamatsu 
ORCA) achieve about 30 frames/s, which makes exposure times shorter 
than 30 ms impossible. These systems are relatively inexpensive and are 
easier to set up than confocal microscopes. Several proprietary software 
packages, such as MetaMorph (Universal Imaging), can drive the entire 
system (microscope, camera, shutters, monochromator). 
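As a rough check of whether a given camera and objective fall in the recommended pixel-size range, the final pixel size can be estimated as the physical camera pixel divided by the total magnification. The sketch below is illustrative only: the 6.45 µm camera pixel is an assumed example value, not a specification taken from this chapter, and the function names are hypothetical.

```python
def final_pixel_size_nm(camera_pixel_um: float, magnification: float) -> float:
    """Size of one camera pixel projected back onto the sample, in nm."""
    return camera_pixel_um * 1000.0 / magnification


def in_recommended_range(pixel_nm: float, low_nm: float = 60.0, high_nm: float = 80.0) -> bool:
    """True if the pixel size lies within the 60-80 nm range given in the text."""
    return low_nm <= pixel_nm <= high_nm


def frame_period_ms(frames_per_second: float) -> float:
    """Time per frame imposed by the camera readout rate, in ms."""
    return 1000.0 / frames_per_second


CAMERA_PIXEL_UM = 6.45  # assumed example value for a CCD pixel

for magnification in (63, 100):
    px = final_pixel_size_nm(CAMERA_PIXEL_UM, magnification)
    print(f"{magnification}x objective: {px:.1f} nm/pixel, "
          f"in 60-80 nm range: {in_recommended_range(px)}")

print(f"30 frames/s corresponds to about {frame_period_ms(30):.0f} ms per frame")
```

With the assumed camera pixel, only the 100× objective reaches the 60-80 nm range; a 63× objective would need additional magnification or a camera with smaller pixels.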

This modified widefield microscopy is well-suited for scoring the posi- 
tion of a locus relative to another locus, or relative to a fixed structure 
(spindle pole body, nuclear periphery and nucleolus) in a large number of 
cells on an agarose pad. It is less well-suited for rapid, high-resolution time- 
lapse imaging, due to the high sampling and deconvolution that is needed 
for highest resolution data. If the position of two loci is to be monitored, 
either two different excitation colors have to be used (which increases 
the resolution power) or the spots have to be of significantly different 
sizes. 3D stacks of images are needed to evaluate the spatial positioning of 
the locus relative to another spot or to the nuclear periphery (see below). 
Optimal parameters for GFP imaging are excitation 475 nm, z-spacing 
200 nm with 20-plane stacks, and 100-200 ms exposure time per slice. Due to 
the optical resolution of the microscope, it is not useful to sample more in 
the z-axis, which would also increase the acquisition time, and impair 
accuracy if the imaged locus is moving. 

For dual color imaging using CFP and YFP chromophores, optimal 
wavelengths are 432 and 514 nm, respectively, with exposure times of about 
200 ms. The two wavelengths should be acquired successively at each focal 
plane. Note that the wavelengths and exposure times depend greatly on the 
filters present on the microscope, and should be optimized for each system 
(monochromators allow nm-scale changes in wavelength). A phase image is 
useful for determining cell cycle stage, and can be taken before or after 
acquisition of the fluorescence stack of images. 

In widefield microscopy, an entire field of cells is illuminated during 
exposure. The camera records both the in-focus and out-of-focus photons. 
While this creates a higher background than confocal microscopy, it allows 
more photons to be recorded by the camera, and these signals are used for 
image restoration algorithms. Deconvolution is particularly powerful when 
applied to widefield imaging to reassign signal to the right plane. Several 
software packages propose deconvolution solutions, including MetaMorph, 
DeltaVision, and Huygens. Similarly, denoising of the images (which 
removes optical and electronic noise from the digitized images) can be 
applied to increase the signal-to-noise ratio. Although no commercial 
package is available to date, development of such denoising solutions is a 
very active field in image processing and could lead to significant reduction 
of both light intensity and illumination time in the near future. 

Live cell time-lapse imaging is used to record the dynamics of tagged 
chromatin or other subnuclear structures. Since repetitive illumination of 
the sample is involved, it is important to keep in mind that excitation light 
can stress the organism, and control experiments must be carried out to 
ensure that the level of illumination does not have deleterious cellular 
consequences. Parameters to optimize include image resolution (pixel 
size), the number of frames along the z-axis, excitation light intensity and 
exposure time. 

Widefield high-precision microscopy is useful for low-frequency time- 
lapse imaging over fairly long periods of time (hours). The excitation light 
from the monochromator or the LED should be filtered using gray filters to 
reduce phototoxicity. Limits are set by the intensity of light used, the 
number of planes acquired for each time point and the time between each 
acquisition. In our experience, up to 300 stacks of 5 sections (1500 frames, 
50 ms exposure per frame, 1 min interval between stacks) can be acquired 
without affecting cell cycle and with only moderate bleaching. Increasing 
sampling frequency will increase bleaching and damage the cells. Confocal 
or SD-systems are better choices for rapid time-lapse imaging, as acquisition 
speed is faster and photo-induced damage can be reduced by limiting the 
excitation time. 
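When planning a time-lapse series, it can help to add up the cumulative illumination before going to the microscope. The following sketch simply does the arithmetic for the example quoted above (300 stacks of 5 sections, 50 ms per frame, 1 min between stacks); the class and attribute names are hypothetical, and the cumulative exposure is only a bookkeeping figure, not a measure of the light dose, which also depends on intensity and wavelength.

```python
from dataclasses import dataclass


@dataclass
class TimeLapsePlan:
    n_stacks: int            # number of time points
    planes_per_stack: int    # focal planes acquired per time point
    exposure_ms: float       # exposure per frame
    interval_s: float        # time between the starts of successive stacks

    @property
    def total_frames(self) -> int:
        return self.n_stacks * self.planes_per_stack

    @property
    def total_exposure_s(self) -> float:
        """Cumulative time the sample is illuminated."""
        return self.total_frames * self.exposure_ms / 1000.0

    @property
    def duration_min(self) -> float:
        """Time from the first to the last stack."""
        return (self.n_stacks - 1) * self.interval_s / 60.0


# Values quoted in the text for low-frequency widefield time-lapse imaging.
plan = TimeLapsePlan(n_stacks=300, planes_per_stack=5, exposure_ms=50.0, interval_s=60.0)
print(f"{plan.total_frames} frames, {plan.total_exposure_s:.0f} s of illumination "
      f"spread over {plan.duration_min:.0f} min")
```

For the quoted settings this gives 1500 frames and 75 s of cumulative exposure over roughly 5 h, which makes it easy to see how quickly the total rises if the interval is shortened or more planes are added.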

2.5.2. Laser-scanning microscopy 

Laser-scanning systems are based on the rapid scanning of the sample by an 
excitation laser and recording of the emitted signal by photomultipliers 
(PMTs). The out-of-focus light is blocked by a pinhole which should be 
closed as far as possible. While these systems are well-suited to discriminate 
wavelengths and capture several at once, the scanning speed is often the 
limiting factor for image acquisition. Nonetheless, to track chromatin in 
individual cells at intervals of 1.5 s over timescales of 5-10 min, 
commercially available systems such as the Zeiss LSM510 are well suited. 
This system, although slower than the newer SD systems (see below), is fast 
enough to track significant changes in chromatin movements (jumps >0.5 µm in 
10 s; Heun et al., 2001b). Useful settings are described below (see also 
Neumann et al., 2006). Note that pixel size is set by the user, and to track 
chromatin in vivo pixel size should be <100 nm. A high-resolution piezo 
table is essential to achieve speed and reproducibility in z position: 
Laser                    Argon/2 (458, 488, or 514 nm); tube current 4.7 A; 
                         output 25% 
GFP acquisition          Channel 1: LP 505 nm 
YFP/CFP single track     Channel 1: LP 530 nm 
                         Channel 3: BP 470-500 nm 
Channel settings         Pinhole 1-1.2 Airy units (optical slice 700-900 nm); 
                         detector gain 930-999; amplifier gain 1-1.5; 
                         amplifier offset 0.2-0.1 V; laser transmission (AOTF) 
                         0.1-1% for GFP excitation, 1-15% for YFP, and 10-50% 
                         for CFP single track acquisition 
Scan settings            Speed 10 (0.88 µs/pixel); 8 bits; one scan direction; 
                         4 average line scans; zoom 1.8 (pixel size 
                         100 × 100 nm) 
Imaging intervals        1.5 s 



2.5.3. Spinning-disk confocal microscopy 

As mentioned above, an attractive alternative to widefield and laser-scanning 
microscopes is the SD confocal. SD microscopes look similar to widefield 
systems yet the excitation light is provided by lasers, the beams of which are 
focussed on pinholes located on a disk rotating at high speed. Every point of 
the focal plane is illuminated several thousand times per second, but only for a 
fraction of a microsecond. The emitted light is filtered by passing through the 
pinholes to remove out-of-focus photons. Acquisition is achieved on a CCD 
camera, as for widefield systems. The overall quality of the picture is 
improved due to the confocality of the system: there is no haze as observed 
in widefield images. For example, nuclei which appear elongated along the z- 
axis in widefield stacks will appear more round using an SD confocal 
(Fig. 21.3A and B). Moreover, due to the "intermittent" excitation of 
fluorophores by the SD, these systems show less bleaching and phototoxicity. 
This allows higher frequency sampling, at a rate that is generally limited only 
by the acquisition rate of the camera. Where a laser-scanning confocal can 
record only a single nucleus with five planes and a 0.45-µm z-spacing, with 
one stack every 1.5 s, an SD system is able to record 20 planes at 0.2 µm 
spacing every 1.5 s, on a whole field of view. 

Many systems are now available (Roper, Perkin Elmer, Andor, Zeiss 
provide full setups), all of which are based on Yokogawa scan heads. 
This head is the part which contains the SD itself, as well as filters for 
excitation/emission and the dichroic filter. As most of the light is stopped by 
the SD, powerful lasers (> 15 mW for 488 nm excitation line) have to be 
used, which increases the cost of such setups. The camera can be either 
a classical CCD system (see above) or a more sensitive (but noisier) 
EM-CCD. 
Figure 21.3 Comparison of microscope systems. (A, B) Projections in x-y, x-z, and y-z 
for yeast cells tagged with Nup49-GFP and a locus bearing a lacO array bound by a 
GFP-LacI fusion: (A) the image stack was taken with a high-resolution widefield 
microscope equipped with a monochromator and piezo (TillVision®), while in (B) the 
image was taken with a spinning disk confocal. The images were not further treated or 
deconvolved. (C) Zeiss LSM510 confocal image of yeast cells growing in an agarose pad, 
bearing the following markers and fusion proteins: nuclear envelope (Nup49-GFP, white 
ellipse), spindle pole body (Spc42-CFP, lighter spot on the nuclear envelope, indicated 
by a white arrow), nucleolus (Nop1-CFP, gray internal crescent), and a tagged telomere, 
Tel5R::lacO bound by GFP-LacI (gray arrow). Alongside are examples of time-lapse 2D 
confocal imaging on a Zeiss LSM510 confocal microscope of two differently tagged 
telomeres relative to each other. They are displayed orthogonally and rotated such that 
the time axis (z) is horizontal. Top panel: Tel6L-TetR-YFP (lighter gray), Tel6R-CFP-LacI 
(darker gray); bottom panel (these two telomeres have been shown to colocalize; Schober 
et al., 2009): Tel5L-TetR-YFP (lighter gray), Tel5R-CFP-LacI (darker gray). The green 
background staining of the nucleus is due to the TetR-YFP diffuse in the nuclear volume 
(reproduced with permission from Bystricky et al., 2005; see this paper for color images). 




3. Data Analysis and Quantitative 
Measurements 



3.1. Accurate determination of the 3D position 
of a tagged locus 

To determine the position of a tagged locus inside the nucleus, the position 
of the center of the nucleus and of the locus have to be reconstructed from 
the microscopic images. As described before, the locus is usually labeled by 
LacI or TetR fused to a fluorescent protein. The outline of the nucleus can 
be determined either by labeling a component of the nuclear pore complex 
or by using the background fluorescence given by unbound repressor 
proteins filling the nuclear volume. The latter method allows reliable 
identification of the center of the nucleus, yet it is difficult to measure its 
exact size since background fluorescence fades at the boundary. Whenever 
the size of the nucleus or the exact location of the NE is required, 
nuclear pore staining is recommended, as the boundaries of the nucleus 
are sharper. 

The extraction of the shape of the NE and the position of a fluorescent 
spot from a stack of microscopic images has to deal with the anisotropy of 
the data, that is, the difference in optical resolution along the optical axis of 
the microscope (z-axis) and perpendicular to it (x/y-axes). One image (x/y- 
direction) has a typical optical resolution of 200 nm (with a 100× objective) 
and is sampled with a pixel size of 50-100 nm. In contrast, the resolution in 
z is not better than 300 nm even for a confocal microscope, and the images 
of a stack are typically taken at 200 nm steps. In addition, the fluorescent 
signal from the nuclear pores close to the top and bottom of the nucleus is 
diffuse and poorly resolved, impairing reconstruction of the NE. 

We discuss here two methods to measure the position of a spot relative to 
the NE. Ideally, one would want to directly measure the 3D distance between 
the nuclear rim and the tagged locus. A budding yeast nucleus can be accurately 
represented by an ellipsoid or even a sphere. One possibility is therefore to fit 
an ellipsoid to the nuclear pore staining and use it as a model for the NE. 
Analogously, a 3D Gaussian distribution can be fitted to the staining of the 
locus to determine its position with high accuracy. The distance between the 
locus and the NE (or the center of the nucleus) can then be calculated using the 
ellipsoid and the position of the spot. However, due to the limited microscopic 
resolution in the z-direction (~0.6 µm for green light in widefield and 
~0.45 µm for a confocal), and the small size of the yeast nucleus, precise 
definition of the NE is particularly difficult within 0.4 µm of the top or bottom 
of the nuclear sphere. Attempts to solve this problem require custom-tailored 
multistep processing of highly sampled image stacks (Berger et al., 2008), and to 
date no standard software has been established. 
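To make the geometry of the direct 3D measurement concrete, the sketch below computes the distance from a spot to the NE once a nuclear center, a radius, and a spot position are available. It is a simplified illustration, not the fitting procedure itself: it assumes the nucleus is modeled as a sphere (the simpler of the two models mentioned above), that positions are given as pixel and plane indices, and that the fitting of the envelope and of the spot has already been done elsewhere. All names and example values are hypothetical.

```python
import math


def spot_to_envelope_um(spot, center, radius_um, xy_pixel_um, z_step_um):
    """Distance (in µm) from a spot to the surface of a spherical nucleus model.

    spot, center: (x, y, z) tuples in pixel indices (x, y) and plane index (z).
    The x/y and z axes are scaled separately because the sampling is anisotropic.
    A negative result would place the spot outside the modeled envelope.
    """
    dx = (spot[0] - center[0]) * xy_pixel_um
    dy = (spot[1] - center[1]) * xy_pixel_um
    dz = (spot[2] - center[2]) * z_step_um
    distance_from_center = math.sqrt(dx * dx + dy * dy + dz * dz)
    return radius_um - distance_from_center


# Hypothetical example: 65 nm pixels, 200 nm z-steps, nuclear radius 1 µm.
d = spot_to_envelope_um(spot=(56, 50, 12), center=(50, 50, 10),
                        radius_um=1.0, xy_pixel_um=0.065, z_step_um=0.2)
print(f"spot lies about {d:.2f} µm inside the modeled envelope")
```

Because an error of a single plane in z shifts the result by the full z-step, this kind of calculation is only as reliable as the z localization, which is the limitation discussed in the text.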

Once the position of the locus and of a second nuclear structure, such as 
the nucleolus or the spindle pole body, have been determined accurately, a 
more detailed analysis of nuclear organization can be performed based on 
determination of an axis within the nucleus. If only the distance of a locus to 
the nuclear center is measured, the nucleus is treated as spherically symmet- 
ric, which is, of course, not the case. Since the nucleolus and spindle 
pole body are located at opposite ends of the nucleus, they define an axis 
that can be exploited as a landmark for locus position. This allows one 
to score deviations of locus distribution from spherical symmetry (Berger
et al., 2008).

To deal with the poor z resolution of microscopic stacks, an alternative
method exploits the fact that resolution is better in x-y and a spot can be
assigned to a specific plane of an image stack. Instead of calculating the 3D 
distance between the spot and the spherical NE directly, one measures 
position in the plane where the spot is brightest. In this plane, the nucleus 
is a circle, which can be partitioned into three concentric zones of equal area 
(Fig. 21.4B). The spot position is then sorted into the outermost (zone 1),
the intermediate (zone 2), or the innermost zone (zone 3). To obtain equal
areas for the three zones, the boundaries between zones 1 and 2 and
between zones 2 and 3 are at radii of $\sqrt{2/3}\,R$ and $\sqrt{1/3}\,R$, respectively,
where R is the radius of the nucleus in the chosen plane. Then it follows
from the principle of Cavalieri that each zone represents one third of the 
nuclear volume, justifying the use of this approach. 
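
The boundary radii follow directly from the equal-area condition. As a short worked derivation (the symbols $r_{12}$ and $r_{23}$ for the two zone boundaries are introduced here for illustration only):

```latex
\begin{aligned}
\pi r_{12}^{2} &= \tfrac{2}{3}\,\pi R^{2} &&\Rightarrow\quad r_{12} = \sqrt{2/3}\,R \approx 0.816\,R,\\
\pi r_{23}^{2} &= \tfrac{1}{3}\,\pi R^{2} &&\Rightarrow\quad r_{23} = \sqrt{1/3}\,R \approx 0.577\,R.
\end{aligned}
```

Expressed as a distance from the periphery normalized to R, the two boundaries therefore lie at approximately 0.184 and 0.423.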

For practical applications, we use the following procedure: 

1. Measure the distance between the spot and the periphery along a nuclear
diameter as well as the diameter itself. Several programs can be used to
extract the coordinates of points of interest from an image. For this task,
the freely available pointpicker plug-in for ImageJ is particularly useful
(http://bigwww.epfl.ch/thevenaz/pointpicker/).

2. Normalize the spot-pore distance to the radius (not diameter!) of the
circle.

3. Sort the spot into zone 1 (if the normalized distance is $< 1 - \sqrt{2/3}$),
zone 2 (if it is between $1 - \sqrt{2/3}$ and $1 - \sqrt{1/3}$), or zone 3
($> 1 - \sqrt{1/3}$).

4. Compare the measured distribution to another one (different strain, 
condition, etc.) or to a uniform distribution using, for example, a 
$\chi^2$ test. If only percentages of one zone (e.g., the outermost zone) are
compared, a proportional test should be used. 
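
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not part of any published software: the function names and the example measurements are hypothetical, and the χ² statistic is computed by hand against a uniform expectation of one-third per zone (critical value 5.991 for 2 degrees of freedom at α = 0.05).

```python
import math

# Thresholds on the normalized spot-periphery distance (see step 3).
ZONE1_MAX = 1 - math.sqrt(2 / 3)   # about 0.184
ZONE2_MAX = 1 - math.sqrt(1 / 3)   # about 0.423

def assign_zone(spot_to_periphery, diameter):
    """Return zone 1, 2 or 3; both arguments must use the same length unit."""
    radius = diameter / 2.0                      # normalize to the radius, not the diameter
    d = spot_to_periphery / radius
    if not 0.0 <= d <= 1.0:
        raise ValueError("spot must lie between the periphery and the center")
    if d < ZONE1_MAX:
        return 1
    if d < ZONE2_MAX:
        return 2
    return 3

def chi2_uniform(counts):
    """Chi-square statistic of three zone counts against a uniform distribution."""
    expected = sum(counts) / 3.0
    return sum((c - expected) ** 2 / expected for c in counts)

CHI2_CRIT_DF2_P05 = 5.991   # 2 degrees of freedom, alpha = 0.05

# Hypothetical measurements: (spot-periphery distance, nuclear diameter) in micrometers.
measurements = [(0.10, 2.0), (0.35, 1.9), (0.80, 2.1)]
zones = [assign_zone(dist, dia) for dist, dia in measurements]

# Hypothetical zone counts from 100 scored nuclei.
counts = [60, 25, 15]
is_nonuniform = chi2_uniform(counts) > CHI2_CRIT_DF2_P05
```

For a comparison between two strains or conditions, the same counts would instead enter a contingency-table test.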

A locus whose position is uniformly distributed will be found with an 
equal probability of 1/3 in each of the three zones. It should be noted, 
however, that the three zones do not coincide exactly with three concentric 
shells of equal volume, which is the desired partitioning of the nucleus, if 
one wishes to assess whether a locus is enriched at the nuclear periphery 
(Fig. 21.4D). We have calculated the error incurred by this method, and
plotted it against the true distribution of spots in Fig. 21.4E. Whereas the
zone measurement is no longer precise when there is strong enrichment in 
any of the three zones, it accurately monitors a uniform distribution of spots. 
Moreover, the zone method consistently underestimates enrichment or 
depletion, which means that any measured enrichment in one zone did 
not arise from an artifact of the measurement method (Gehlen, 2009). 
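
The statement that a uniformly distributed locus falls into each zone with probability 1/3, even though zones and equal-volume shells do not coincide, can be checked with a small Monte Carlo simulation. The sketch below assumes an idealized spherical nucleus of unit radius and perfect localization; all names are illustrative.

```python
import math
import random

ZONE_12 = math.sqrt(2 / 3)       # zone boundaries, as fractions of the in-plane radius
ZONE_23 = math.sqrt(1 / 3)
SHELL_12 = (2 / 3) ** (1 / 3)    # shell boundaries for three shells of equal volume
SHELL_23 = (1 / 3) ** (1 / 3)

def random_point_in_unit_sphere(rng):
    """Rejection sampling of a point uniformly distributed in the unit sphere."""
    while True:
        x, y, z = rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y + z * z <= 1.0:
            return x, y, z

def zone_of(x, y, z):
    """Zone scored in the focal plane at height z (a circle of radius sqrt(1 - z^2))."""
    plane_radius = math.sqrt(max(0.0, 1.0 - z * z))
    frac = math.hypot(x, y) / plane_radius if plane_radius > 0 else 0.0
    return 1 if frac > ZONE_12 else 2 if frac > ZONE_23 else 3

def shell_of(x, y, z):
    """Equal-volume shell based on the true 3D distance from the center."""
    r = math.sqrt(x * x + y * y + z * z)
    return 1 if r > SHELL_12 else 2 if r > SHELL_23 else 3

def simulate(n=30000, seed=1):
    rng = random.Random(seed)
    zone_counts = {1: 0, 2: 0, 3: 0}
    shell_counts = {1: 0, 2: 0, 3: 0}
    agree = 0
    for _ in range(n):
        point = random_point_in_unit_sphere(rng)
        zn, sh = zone_of(*point), shell_of(*point)
        zone_counts[zn] += 1
        shell_counts[sh] += 1
        agree += (zn == sh)
    zone_frac = {k: v / n for k, v in zone_counts.items()}
    shell_frac = {k: v / n for k, v in shell_counts.items()}
    return zone_frac, shell_frac, agree / n

zone_frac, shell_frac, agreement = simulate()
```

Both sets of fractions come out close to one-third, while the agreement between zone and shell assignment of individual spots is below 100%, which is the source of the deviations seen for non-uniform distributions.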

As mentioned above, measuring spot position with respect to the NE is 
particularly difficult close to the poles of the nucleus. This is aggravated if 
the NE and spot are both tagged with GFP. To avoid severe errors that arise 

Figure 21.4 Subnuclear localization relative to the nuclear envelope: the zoning
method. (A) Fluorescence microscopy image of a yeast nucleus (one plane of a 3D 
stack of images) bearing GFP-Nup49, a component of the nuclear pore complex, and a 
lacO array integrated into the genome and bound by a LacI-GFP fusion (fluorescent 
spot). (B, C) For quantification, the ring representing the nuclear envelope in the plane 
where the spot is brightest is partitioned into three zones of equal area. The nuclear 
diameter in this plane (gray arrow) and the distance of the spot to the periphery (black 
arrow) are measured and the ratio, which defines the localization of the spot, is scored 
as falling into zone 1, 2, or 3. (D) Vertical cut through the nucleus. Three shells of equal 
volume are shown in shades of gray. The division of the nucleus into three zones based 
on equal area in each plane also results in three equal volumes (the boundaries are 
shown as black lines), but these do not coincide exactly with shells of equal volume. 
Because of lack of resolution in the top and bottom slices of an image stack (see text), 
we remove samples in which the locus falls into the upper or lower 20% of the nuclear 
sphere. This so-called "decapping" is indicated in darker gray. Removal does not affect 
the zones and shells equally. (E) The deviation from the actual distribution in each zone 
when foci are scored using the zoning method with no decapping. Without decapping, 
the shell measurement is exact and coincides with the solid line. (F) The deviation from 
actual distribution is shown for foci monitored by either the zoning method or the shell 
method, after removal of 0.4 μm from each pole. Both types of measurements deviate
from the true enrichment, although the zoning method is most accurate for zone 1. A
fraction of one-third corresponds to a uniform distribution; 0.6 is a typical fraction, for
example, for an anchored yeast telomere. 

from such poorly resolved signals, we do not score cells in which the tagged
locus is positioned within 0.4 μm of the top or bottom of the nucleus. This
so-called decapping can include 3–4 planes (up to 20% of the focal planes)
from each pole. While it removes questionable signals, it also affects the 
distribution determined by both shell (ideal 3D distance measurement) and 
the zoning method, because peripheral spots are more likely to be discarded 
than interior ones (Fig. 21.4D). In Fig. 21.4F we plot the error incurred by 
zoning and shell measurements as a function of spot enrichment, under 
decapping conditions. Intriguingly, decapping by 20% actually improves the 
accuracy of the zone measurements, while the shell measurements suffer 
from removal of these planes. Our analysis shows that the shell measurement 
method performs best in cases of extreme enrichment or depletion while 
the zoning is more accurate for moderate enrichments (35–60%), particularly
in the outermost zone (zone 1). In principle, it is possible to compensate
for these errors but one needs to know the exact size of the caps
removed. On a practical level, it is important to remember that the zoning 
method accurately scores both uniformly distributed loci and distributions 
close to a uniform, independently of the amount of decapping performed. 
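
As a rough illustration of what decapping removes, the following sketch filters spots by their z position and computes the volume fraction of one spherical cap. It assumes an idealized spherical nucleus; the function names, the default cap height of 0.4 μm, and the example radius of 1 μm are assumptions made for the example.

```python
def keep_spot(z_spot, z_bottom, z_top, cap=0.4):
    """True if the spot lies more than `cap` (same unit as z) from both poles."""
    return (z_spot - z_bottom) > cap and (z_top - z_spot) > cap

def cap_volume_fraction(cap_height, radius):
    """Volume of one spherical cap divided by the volume of the sphere.

    Cap volume: pi * h^2 * (3R - h) / 3; sphere volume: 4 * pi * R^3 / 3.
    """
    h, r = cap_height, radius
    return h * h * (3 * r - h) / (4 * r ** 3)

# Example: nucleus of radius 1 micrometer, caps of 0.4 micrometer at each pole.
one_cap = cap_volume_fraction(0.4, 1.0)
both_caps = 2 * one_cap
```

Under these assumptions each cap holds about 10% of the nuclear volume, so roughly one-fifth of uniformly distributed spots would be discarded.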

3.2. Colocalization of a DNA locus with a subnuclear structure 

To further investigate the function of DNA position, it is interesting to 
know if a fluorescently tagged locus colocalizes with other structural com- 
ponents of the nucleus. This can be investigated by tagging the locus in one 
fluorophore and the structure of interest with another, and monitoring their 
colocalization. Correction for chromatic shift must be made for each 
instrument and imaging session, by alignment of signals from small beads 
that emit fluorescence at multiple wavelengths. 

Unless a locus is actively excluded from a subnuclear structure, a certain 
level of random colocalization will be detected. The amount of this back- 
ground overlap will depend on the size and form of the structures moni- 
tored. To assess whether experimentally obtained colocalization values are 
significant or not, one must determine the expected degree of non-specific 
colocalization for a uniformly distributed locus. This can be calculated as the 
ratio between the volume of the region in which the spot is considered as 
colocalizing with the structure, and the total volume available to the spot. 

As an example we take the binding of a chromatin locus (gene or 
telomere) to nuclear pores (Schober et al., 2009). The diffraction-limited
resolution of a light microscope is not sufficient to distinguish the binding 
of a locus to nuclear pores from its binding to other components at the NE. 
A genetic trick to circumvent this problem is to examine a yeast strain with 
an N-terminal deletion of the nuclear pore component NUP133 
(nup133ΔN; Schober et al., 2009). In this mutant the pores are not
distributed all over the NE, but are clustered on one side of the nucleus
(Fig. 21.5A). A high degree of colocalization of a locus with the pore cluster
may indicate specific affinity for a nuclear pore component. 

To determine the colocalization arising from a uniform distribution of the 
locus, we first model the pore cluster as a conical disk at the nuclear 
periphery, whose dimensions are set based on empirical measurements. 
The spot is considered to colocalize with the cluster if it at least touches it 
(Fig. 21.5B). For the center of the chromatin spot, this defines a region that is 
larger than the pore cluster, which represents spot and pore colocalization. 
The predicted degree of background coincidence is the ratio between this 
colocalization volume and the total volume that is available to the spot. In the 
calculation of nonspecific colocalization, one can include other parameters, 
such as an exclusion of the spot from a subnuclear volume like the nucleolus, 
or a nonpore-associated enrichment at the NE. The significance of any 
experimental enrichment in colocalization is then determined by a proportional
analysis test with Bonferroni correction for multiple testing.
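
The logic of this comparison can be sketched as follows. The background probability is the ratio of the two volumes; as a simple stand-in for the proportional test named above, the sketch uses a one-sided exact binomial test, followed by a Bonferroni adjustment. Function names and all numbers in the example are hypothetical.

```python
import math

def expected_colocalization(coloc_volume, available_volume):
    """Background colocalization probability for a uniformly distributed spot."""
    return coloc_volume / available_volume

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of at least k colocalizing cells."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)

# Hypothetical example: colocalization region = 15% of the available volume,
# 40 of 100 scored cells show colocalization, 3 strains tested in parallel.
p_background = expected_colocalization(0.15, 1.0)
p_raw = binomial_upper_tail(40, 100, p_background)
p_adjusted = bonferroni(p_raw, 3)
```

Excluded volumes such as the nucleolus would enter through a smaller `available_volume`.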

3.3. Quantification of locus mobility 

A stretch of chromatin (or any other object) inside the nucleus is exposed to 
numerous hits of water or other small molecules, proteins, and other 
macromolecules, as well as other chromatin fibers. Due to these interac- 
tions, it inevitably performs a seemingly random movement called Brow- 
nian motion. This motion is limited by the NE, but in many cases locus 
diffusion is even more constrained, either confined to a certain area or 

Figure 21.5 Determining the significance of colocalization. (A) Nuclear pores tagged
with Nup49-GFP (red) and a LacI-tagged locus (green). The two left images in the
upper panel are not deconvolved, all other images are. In the nup133ΔN mutant, the
nuclear pores form a cluster (Schober et al., 2009). (B) The expected colocalization for a
randomly positioned spot and the pore cluster can be calculated as a ratio of volumes 
(see text). The figure shows a cut through the nucleus. The pore cluster is modeled as a 
conical layer shown in red. The spot is considered as colocalizing if it at least touches the 
pore cluster, which results in the colocalization zone (green). 

obstructed by obstacles. The random movement can also be temporarily or
continuously superimposed by active displacement which possibly expresses 
itself as increased speed and/or directionality of movement. 

The first step of the quantitative analysis of chromatin movement is the 
determination of the position of the locus and the nuclear center for each 
time point of the time-lapse series. Indeed, since the nucleus itself is moving 
inside the cytoplasm, one must compensate for its displacement to measure 
the movement of a locus relative to the nucleus. Several general purpose 
software packages like Imaris (http://www.bitplane.com) offer object track- 
ing functionality but usually require uniformly high-contrast images. The 
algorithms are mostly based on threshold principles, and it is difficult to 
correct insufficient results by hand. In collaboration with D. Sage and 
M. Unser, a dynamic programming algorithm was developed which is 
dedicated to the tracking of single spots in noisy images and can be applied 
to 2D or 3D time-lapse movies (Sage et al., 2005). The algorithm is
implemented as a publicly available plug-in for the free software ImageJ 
(http://bigwww.epfl.ch/sage/soft/spottracker/). 

This tracking works in two steps: first, the images are aligned with 
respect to the center of the nucleus to compensate for the movement of 
the entire nucleus throughout the time-lapse series. A Mexican hat filter can 
be applied to enhance spot-like structures in the images. Next, the spot 
tracking is performed using three different properties of the spot: 

1. Spot intensity: the spot fluorescence is more intense than that of the 
background. 

2. Within one time step the spot can only travel a limited distance. 

3. In contrast to nuclear pores, the spot can be located in the nuclear 
interior. 

To reflect these properties the tracking algorithm uses four different 
criteria to determine the spot position at a given time point: 

1. Pixel intensity

2. Displacement from the location at the previous time point 

3. Displacement from the last user-defined position (see below) 

4. Distance from the nuclear center 

The user can give different weights to these criteria to optimize the 
performance of the algorithm for different situations or image qualities. Most 
importantly, the plug-in offers the possibility to correct the trajectory manually 
by forcing it to pass through a given pixel at a certain time point. The output of 
the plug-in is the position of spot and nuclear center for each time point. 
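
To make the idea of a weighted, dynamic-programming trajectory search concrete, here is a deliberately simplified sketch. It is not the SpotTracker implementation: it works on small 2D frames given as nested lists, uses only three of the four criteria (intensity, displacement from the previous position, distance from the nuclear center), omits the user-defined constraint, and all names, weights, and the `max_step` limit are assumptions chosen for illustration.

```python
import math

def track_spot(frames, center, w_int=1.0, w_disp=0.5, w_center=0.1, max_step=2.0):
    """Return one (row, col) per frame maximizing the summed score.

    frames: list of 2D lists of pixel intensities, already aligned on the nucleus.
    center: (row, col) of the nuclear center in the aligned frames.
    """
    n_rows, n_cols = len(frames[0]), len(frames[0][0])
    pixels = [(r, c) for r in range(n_rows) for c in range(n_cols)]

    def local_score(frame, p):
        # Reward bright pixels, penalize distance from the nuclear center.
        return w_int * frame[p[0]][p[1]] - w_center * math.dist(p, center)

    score = {p: local_score(frames[0], p) for p in pixels}
    back_pointers = []
    for frame in frames[1:]:
        new_score, pointers = {}, {}
        for p in pixels:
            best_q, best = None, -math.inf
            for q in pixels:                       # candidate previous positions
                d = math.dist(p, q)
                if d > max_step:                   # limited travel per time step
                    continue
                s = score[q] - w_disp * d
                if s > best:
                    best, best_q = s, q
            new_score[p] = best + local_score(frame, p)
            pointers[p] = best_q
        back_pointers.append(pointers)
        score = new_score

    # Backtrack from the best final pixel to recover the whole trajectory.
    p = max(score, key=score.get)
    path = [p]
    for pointers in reversed(back_pointers):
        p = pointers[p]
        path.append(p)
    path.reverse()
    return path
```

Because the score is optimized over the whole movie rather than frame by frame, an isolated bright noise pixel that cannot be reached within `max_step` does not capture the trajectory.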

Because of individual differences between cells, it is necessary to analyze
at least 8–10 movies with a total time of more than 40 min for each strain or
condition. We discuss three parameters that can be extracted from the 
trajectories to compare different samples. 

3.3.1. Track length

A simple and robust parameter of chromatin dynamics is the track length 
over a time-lapse series of fixed duration. This parameter monitors average 
mobility of a locus and can be used for comparison of movies with the same 
time step and duration. It is, however, a very artificial parameter because the 
true trajectory of the spot is inaccessible due to the lack of temporal and 
spatial resolution and is much longer than the measured track length (see 
Fig. 21.6A for illustration).
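
Given the tracked positions of the spot and of the nuclear center, the track length is simply the summed distance between successive positions of the spot relative to the nucleus. A minimal sketch, with hypothetical function names:

```python
import math

def relative_positions(spot_positions, center_positions):
    """Spot coordinates relative to the nuclear center, frame by frame."""
    return [tuple(s - c for s, c in zip(spot, center))
            for spot, center in zip(spot_positions, center_positions)]

def track_length(positions):
    """Sum of the distances between consecutive positions of one trajectory."""
    return sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))
```

Values are only comparable between movies acquired with the same time step and duration.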

3.3.2. Step size and large steps 

The average step size of the chromatin locus within its "walk" is another 
useful characteristic. Like the track length, this parameter depends on the 
time step used for image acquisition, but can be used to compare differences 
in mobility in identically imaged samples. Directed movement does not 
necessarily reveal itself in large single steps but rather in several successive 
correlated steps. Therefore, it is also useful to look for exceptionally high 
displacements ("large steps") within a certain time window. Empirically we 
find that a useful parameter for distinguishing patterns of mobility is the 
frequency of steps larger than 500 nm during 10.5 s (7 × 1.5 s steps; Heun
et al., 2001b).
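
Both parameters are easily computed from a trajectory. The sketch below interprets a "large step" as a net displacement of more than 0.5 μm over a sliding window of 7 frames; this reading of the criterion, like the function names, is an assumption made for the example.

```python
import math

def mean_step_size(positions):
    """Average distance between consecutive positions."""
    steps = [math.dist(a, b) for a, b in zip(positions, positions[1:])]
    return sum(steps) / len(steps)

def large_step_frequency(positions, window=7, threshold=0.5):
    """Fraction of sliding windows whose net displacement exceeds `threshold`.

    window: number of time steps spanned (7 steps of 1.5 s = 10.5 s).
    threshold: displacement in the unit of the positions (here micrometers).
    """
    if len(positions) <= window:
        raise ValueError("trajectory is shorter than the window")
    displacements = [math.dist(positions[i], positions[i + window])
                     for i in range(len(positions) - window)]
    return sum(d > threshold for d in displacements) / len(displacements)
```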

3.3.3. Mean-squared displacement analysis 

A robust method to analyze the global properties of an object's movement is 
the mean-squared displacement (MSD) analysis. An object in solution 
changes its direction when it bumps into solvent molecules and moves 
linearly in between, generating a random walk. If a number of objects
were initially confined in a small volume and then released, they
would spread over time. It can be derived mathematically that for free
diffusion the mean of the squared distance from one point on the trajectory
to another is proportional to the time difference $\Delta t$:
$\langle (\mathbf{r}(t + \Delta t) - \mathbf{r}(t))^2 \rangle \sim \Delta t$, where $\mathbf{r}(t)$ is the position of the object at time t (Berg, 1993). The
proportionality constant is usually written as 2dD where d is the number of
dimensions and D is called the diffusion coefficient of the object. Thus, for
three-dimensional free diffusion we get $\langle (\mathbf{r}(t + \Delta t) - \mathbf{r}(t))^2 \rangle = 6D\Delta t$
(Fig. 21.6).
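
In practice the MSD is computed from a single trajectory by averaging the squared displacement over all pairs of points separated by the same number of frames. A minimal sketch (hypothetical function name; positions are tuples of coordinates relative to the nuclear center):

```python
def mean_squared_displacement(positions, max_lag):
    """MSD for lags of 1..max_lag frames, averaged over all start points."""
    msd = []
    for lag in range(1, max_lag + 1):
        squared = [sum((b_i - a_i) ** 2 for a_i, b_i in zip(a, b))
                   for a, b in zip(positions, positions[lag:])]
        msd.append(sum(squared) / len(squared))
    return msd
```

Multiplying the lag index by the acquisition time step gives the time axis of the MSD graph. Large lags are averaged over few pairs and are correspondingly noisy.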

However, in a cellular environment there is no free diffusion. The free 
movement of an object can be impaired by confinement, obstacles, and the 
binding to immobile or actively moving structures. The most inevitable 
restriction is the confinement of the object's movement to a nuclear or 
cellular compartment. This implies that the distance of any two points of the 
trajectory cannot exceed the maximal extension of the confining volume. 
Therefore, the MSD curve has to reach a plateau for large time windows 
(Fig. 21.6B). In the case of a spherical confinement, the value of the plateau 

Figure 21.6 Mean-squared displacement (MSD) analysis. (A) The full trajectory of
microscopic movement (light gray) cannot be detected by fluorescence microscopy due 
to limited resolution in time and space. A coarser trajectory (black) is recorded instead. 
(B) The mean of all squared spatial distances between each two points at a given time 
difference results in one point on the MSD graph. The mean-squared distance between 
a point and its successor on the trajectory is the first point on the MSD graph (black). 
The mean-squared distance between a point and its second successor yields the second 
point (dark gray) and so on. Compare the gray tones of the example distances in (A) 
with those of the points on the MSD graph in (B). (C) Analysis of DNA locus dynamics. 
The projected trace of 200 images of a movie of the LYS2 locus is in white. The average 
track length in 5 min is 37.4 μm. Bar: 1 μm. (D) MSD analysis on an average of eight
movies of the LYS2 locus. All cells were observed in G1 phase. (E) The mean-squared
change of spot-spot distance. In contrast to a classical MSD analysis, the mean-squared 

can be calculated as $6/5\,R^2$, where R is the radius of the sphere (Neumann
et al., submitted). Thus, the so-called radius of constraint or the plateau
value can be directly used as a measure for the size of the region explored by 
the object. 

Due to the difficulties in accurately reconstructing the 3D position of 
fluorescent spots (see above), the movement is often observed in a 2D 
projection of the microscopic stacks. It can be calculated that the MSD of 
projected free 3D diffusion is equal to the MSD of free 2D diffusion: 
$\langle (\mathbf{r}(t + \Delta t) - \mathbf{r}(t))^2 \rangle = 4D\Delta t$. In the case of a spherical confinement, the
MSD plateau behaves in the same way and has a value of $4/5\,R^2$ (see
Fig. 21.6C and D; Neumann et al., submitted).
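
Inverting these plateau relations gives the radius of constraint. The sketch below estimates the plateau as the mean of the last few MSD points, which is one simple choice among several; the function names and the number of points averaged are assumptions.

```python
import math

def estimate_plateau(msd_values, n_last=5):
    """Plateau height taken as the mean of the last `n_last` MSD points."""
    tail = msd_values[-n_last:]
    return sum(tail) / len(tail)

def radius_of_constraint(plateau, projected=True):
    """Radius R of a spherical confinement from the MSD plateau.

    plateau = 4/5 * R^2 for a 2D projection, 6/5 * R^2 for true 3D positions.
    """
    factor = 4 / 5 if projected else 6 / 5
    return math.sqrt(plateau / factor)
```

For example, a plateau of 0.2 μm² in a projected movie corresponds to a radius of constraint of 0.5 μm.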

For free diffusion the slope of the MSD line is a measure for the diffusion 
coefficient of the object, as discussed above. In the case of confined diffu- 
sion, the slope of the MSD curve is not constant. The curve is steepest at 
$\Delta t = 0$, and then the slope decreases monotonically (Fig. 21.6B). This is
also true for diffusion with obstacles where, in the unconfined case, the
MSD is not proportional to t but to $t^{\alpha}$ with $\alpha < 1$ (reviewed in Bouchaud
and Georges, 1990). Nonetheless, one can still use the initial slope of the 
curve to compare the intrinsic mobility of different objects or one object 
under different conditions. 
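
One simple way to obtain the initial slope is a least-squares line through the origin fitted to the first few MSD points; dividing by 2d then gives an apparent diffusion coefficient. The sketch assumes negligible localization noise (which would otherwise add a constant offset to the MSD); names and the default of three points are illustrative.

```python
def initial_slope(msd_values, dt, n_points=3):
    """Least-squares slope of a line through the origin over the first points.

    msd_values[k] is the MSD at a lag of (k + 1) * dt.
    """
    if n_points > len(msd_values):
        raise ValueError("not enough MSD points")
    lags = [dt * (k + 1) for k in range(n_points)]
    return (sum(t * m for t, m in zip(lags, msd_values))
            / sum(t * t for t in lags))

def diffusion_coefficient(slope, dims=2):
    """Apparent diffusion coefficient from MSD = 2 * dims * D * t."""
    return slope / (2 * dims)
```

With `dims=2` this corresponds to the projected (2D) analysis described above.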

It should be noted that the movement of a locus relative to the nucleus is 
superimposed by the movement of the nucleus itself. Translational move- 
ment of the nucleus can be subtracted from locus movement by aligning the 
nuclear center throughout the time course (see Section 2.8). If two spots are 
observed, there is the alternative possibility to align one of the spots 
throughout the movie and analyze the movement of the other spot relative 
to the first one. This procedure also eliminates the global movement of the 
nucleus. However, neither the alignment of the nuclear center nor the 
alignment of one spot eliminates the rotational movement of the nucleus. 
A possibility to obtain a quantification of locus mobility that is independent 
of nuclear rotation is to observe the distance between the two loci and 
calculate the mean-squared change of this distance (see Fig. 21.6E). 
Since the distance between the two spots is unaffected by both translation 
and rotation of the nucleus, this "distance MSD" is only influenced by the 
individual movement of the two spots. The distance MSD curve shows 



change of the distance between two spots instead of the mean-squared change of the 
position of one spot is analyzed. (F) The plateau of the distance MSD curve does not 
only depend on the radius of confinement R of the loci, but also on the distance d 
between the confining regions. However, this dependency becomes very weak for 
d > 3R. Therefore, the radius of confinement can be reconstructed from the distance 
MSD plateau only if the confining regions are either equal (d = 0) or sufficiently far 
from each other (d > 3R). 

similar behavior to a classical MSD curve and has been used to derive
diffusion coefficients and radii of constraints (Marshall et al., 1997). However,
it is important to note that the authors assumed that both loci are
confined to the same region. If this is not the case, the height of the plateau, 
as well as the initial slope of the curve, does not only depend on the mobility 
of the loci but also on the distance separating the regions of constraint for 
the two spots. The distance MSD analysis is a valid technique to determine 
radius of constraint and diffusion coefficient for two diffusing spots if one of 
the two following conditions is fulfilled. Either (a) one can assume that the 
confining regions are identical (e.g., the whole nucleus) or (b) they are 
sufficiently far from each other. Three times the radius of constraint was 
found to be a reasonable threshold (Gehlen, 2009). 
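
The distance MSD and the validity condition stated above can be sketched as follows (hypothetical function names; both tracks must contain one position per frame):

```python
import math

def distance_msd(track_a, track_b, max_lag):
    """Mean-squared change of the spot-spot distance for lags of 1..max_lag frames."""
    dist = [math.dist(a, b) for a, b in zip(track_a, track_b)]
    result = []
    for lag in range(1, max_lag + 1):
        squared = [(d2 - d1) ** 2 for d1, d2 in zip(dist, dist[lag:])]
        result.append(sum(squared) / len(squared))
    return result

def confinement_recoverable(region_separation, radius_of_constraint):
    """True if the regions coincide (d = 0) or are far apart (d > 3R)."""
    return region_separation == 0 or region_separation > 3 * radius_of_constraint
```

Because only the inter-spot distance enters, a common translation or rotation of both spots leaves the result unchanged.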




4. IF and FISH on Fixed Samples 

Despite the power of live imaging of GFP-tagged foci, the more 
classical techniques of IF and FISH are recommended in several cases. 
First, for a scientist working alone, the analysis of multiple samples at one 
time point is cumbersome by live imaging. Second, if more than two DNA 
loci need to be imaged at once, or multiple foci in one background, FISH is 
more efficient. Finally, these techniques allow colocalization of protein, 
specific genes, and either genomic DNA or cellular substructures such as the 
spindle. The combination of three or four fluorochromes in a single labeling 
experiment is routine. Nonetheless, there are pitfalls in applying this to 
yeast. First, antibody background and nonspecific fluorescence signal is 
more often observed with yeast cells than with mammalian cells. Second, 
one must preserve native 3D structures of both nuclear and cytoplasmic 
compartments, while eliminating the cell wall to facilitate macromolecular 
access. To check that this has been achieved, the integrity of the NE and the size of
the nucleus can be monitored either by DNA stains or by immunolabeling
with an antibody recognizing the nuclear pore (e.g., Mab414 (Abcam),
which recognizes yeast Nsp1 and yields a perinuclear ring). The spherical
ring structure is lost when spheroplasting conditions are too harsh or if 
detergent use is too high. 

The diameter of an intact haploid yeast nucleus should measure between 
1.8 and 2 μm, and this measurement should be monitored regularly to
ensure that the nuclei observed are intact. Inappropriate methods produce
flattened nuclei with a chromatin mass spanning ~6–8 μm (Heun et al.,
2001a; Weiner and Kleckner, 1994). Due to the nature of in situ hybridization,
accessibility of the DNA probe to the nuclear chromatin is critical and
FISH protocols seek the best possible compromise between accessibility of 
the probe and complete integrity of nuclear and chromatin structure. 
To this end, we eliminate treatments that involve protease, nuclease, and/or
combinations of ionic and nonionic detergent from our protocol. We find
that yeast nuclei collapse when enzymatically digested or if exposed to
detergent mixtures (Gotta et al., 1996; Hediger et al., 2002; Heun et al.,
2001a). Generally, formaldehyde fixation should be performed prior to the
enzymatic removal of the cell wall (spheroplasting). However, if maximal
diffusion of fixative or probes is critical, spheroplasting in osmotically
buffered medium can be performed prior to fixation. Double in situ/immunofluorescence
staining often requires this type of fixation. Finally, even
though cells and spheroplasts are fixed, we recommend imaging in agarose 
pads, since pressure on coverslips can distort 3D structure. Confocal micros- 
copy confirms that 3D organization can be maintained by the following 
procedure (Heun et al., 2001a).

4.1. Yeast strains and media 

Diploid yeast strains may facilitate the microscopic localization of chromo-
somal loci, since the nuclei are nearly twice the size of haploid nuclei. There 
is a significant variation in the efficiency with which different strains are 
converted to spheroplasts, probably reflecting differences in the cell wall 
composition. Diploid strains usually spheroplast faster. Whenever mutants 
and wild type are compared we recommend using isogenic strains or strains 
with similar genetic background to avoid differences in the digestion time. 
Moreover, the efficiency of spheroplasting can be affected by growth con- 
ditions, that is, carbon source, rate of growth and stage of growth at the time 
of harvest. Best results are obtained with cells grown on rich medium 
(YPD) (Rose et al., 1990) and harvested in early to mid-logarithmic phase
(0.5—1 X 10 cells/ml). When a strain background is used for the first time, 
it is useful to do a titration of the spheroplasting enzymes. 

4.2. Antibody purification and specificity 

Polyclonal antibodies can be an advantage for IF because they can recognize 
multiple epitopes. However, rabbit sera very often have strong background 
reactivity with a variety of yeast proteins, besides the desired antigen. This 
can be avoided in two ways: affinity purification of the specific antibodies 
or depletion of nonspecific antibodies by incubation with yeast deleted for 
the gene encoding the antigen. Affinity purification against recombinant 
antigen is performed as follows: 

1. Transfer by Western blotting at least 50 μg of recombinant antigen to a
nitrocellulose filter. 

2. After staining with Ponceau red (0.05% in 3% TCA), cut out the strip
containing the protein band. Wash the nitrocellulose strip 3 × 10 min in
1× TEN (20 mM Tris-Cl, pH 7.5, 1 mM EDTA, 140 mM NaCl),
0.05% Tween 20. Block excess protein binding sites by incubating in
1× TEN + 0.05% Tween 20 + 1% dry milk powder, at room temperature
for 20 min.

3. Incubate the strip with 10–50 μl of serum (depending on antibody titer and
amount of antigen loaded) in 1 ml of 1× TEN, 0.05% Tween 20, 1% dry
milk powder, overnight at 4 °C with constant agitation (rocker or wheel). 

4. Remove the supernatant, wash the strip 3 × 10 min in 1× TEN, 0.05%
Tween 20 at room temperature. Elute the bound immunoglobulin with
300 μl of cold 0.1 M glycine, pH 3.0, for 2 min.

5. Immediately raise the pH to 7.0 by adding 1 M Tris base (the volume 
required should be determined empirically before starting), and place on ice. 

6. Repeat the elution once or twice and pool the elutions that contain 
antibody. Note that it may be necessary to drop the pH of the glycine to 
pH 1.9 for efficient elution. 

7. The antibodies can be stored as aliquots at −80 °C. Stabilization is
enhanced by addition of 1–2% ovalbumin and 20% glycerol. The antibody
is used at a dilution of 1:2 or more for IF. The specificity of the
purified antibodies should be demonstrated by Western blot and IF on 
strains lacking the protein in question. 

If recombinant antigen is not available, rabbit sera can be preadsorbed 
against fixed yeast spheroplasts from a strain lacking the desired antigen. 
Incubation of antiserum and cells can be performed for several hours, and 
the nonbound antibodies are used on the test sample after sedimentation of 
the fixed spheroplasts. 

Monoclonal antibodies usually recognize a single epitope which reduces 
background in yeast, yet some commonly used monoclonals (e.g., anti-HA) do 
recognize an endogenous yeast protein epitope. This can be tested on Western 
blots, although SDS denatured antigens are not always equivalent to formalde- 
hyde fixed ones. The obvious disadvantage of staining for a unique epitope is 
that the risk is greater that it is masked or denatured by the fixation conditions. 

The fluorophore-coupled secondary antibodies should always be tested 
on permeabilized material lacking the primary antibody to assess the back- 
ground fluorescence created by unspecific binding of the secondary anti- 
bodies. To improve signal specificity it is advisable to preabsorb the 
secondary antibody on fixed yeast cells, and to dilute it maximally to 
avoid unnecessary background. 



4.3. Choice of fluorophores 

For efficient visualization of several targets, fluorophores should be chosen 
that are excited and visualized independently. This depends on the excitation 
lines and filter sets available in your microscope. If there is overlap between 
the emission spectra, we recommend attenuating some signals by controlling
the intensity of the excitation line (e.g., on a confocal microscope) to avoid 
"bleed through." Some of the more commonly used fluorophores are Alexa 
Fluor conjugated antibodies at several excitation/emission wavelengths 
(Molecular Probes, Invitrogen), Cy3 (A = 554 nm, E = 566 nm) and 
Cy5 (A = 649 nm, E = 666 nm). The Alexa fluorophores offer the advan- 
tage of increased photostability, as compared to the older Cy dyes. 



4.4. Protocol 

We present here one protocol for combined IF/FISH, but the same proce- 
dure can be used to perform only IF by omitting Sections E and F, or only 
FISH by omitting Section D. Follow all Sections A–G for combined
IF/FISH. 

(A) Fixation 

Cells are fixed either before or after conversion to spheroplasts by the 
addition of freshly dissolved paraformaldehyde (not glutaraldehyde). If
preservation of cell shape and cytosolic structures is required, then cells 
should be fixed before spheroplasting. For detection of low abundance 
nuclear antigens, postspheroplasting fixation can be used. A fresh stock 
solution of 20% paraformaldehyde should be prepared before the 
experiment begins by mixing 5 g of paraformaldehyde, 15 ml H2O,
and 25 μl 10 N NaOH. Dissolve at 70 °C in a closed bottle in a fume
hood for about 30 min with occasional shaking. Adjust final volume to 
25 ml and cool on ice. Note that paraformaldehyde fumes are toxic and 
care should be taken with this reagent. The commercially available 37% 
formaldehyde solution, while less toxic, has long formaldehyde polymers 
that hinder entry into cells. Glutaraldehyde should be avoided since it 
often masks or destroys antigenic epitopes. 

1. Grow yeast cells overnight to about 1 X 10 cells/ml in 50 ml 
YPD or selective media (Rose et al., 1990). 

2. Adjust to 4% paraformaldehyde (final concentration) and incubate 
15 min at room temperature. For a 20-ml culture one would add 
5 ml of 20% paraformaldehyde. If fixation is performed in syn- 
thetic medium, the fixative should be quenched by adjusting to 
0.25 M glycine or 0.1 M Tris-Cl, pH 8.0, after 15 min.

3. Centrifuge 5 min at 800 × g.

4. Carefully resuspend the pellet in 40 ml of YPD and centrifuge
3 min at 800 × g.

5. Repeat step 4. 

Resuspend pellet in YPD (1/10 of initial culture volume) and keep 
it at 4 °C (up to overnight) or proceed to spheroplasting using the 
protocol below. If epitopes are of low abundance, it may be preferred
to spheroplast prior to fixation. In this case start with Section B. 
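
The fixative volumes in Section A can be cross-checked with a short calculation. The following is a minimal sketch in Python, not part of the original protocol; the function names are illustrative, and it assumes that volumes are simply additive when the 20% stock is added to the culture.

```python
def percent_w_v(mass_g: float, volume_ml: float) -> float:
    """Weight/volume percentage: grams of solute per 100 ml of solution."""
    return 100.0 * mass_g / volume_ml


def stock_volume_to_add(culture_ml: float, stock_pct: float, final_pct: float) -> float:
    """Volume of stock (ml) to add to culture_ml to reach final_pct.

    Solves final_pct = stock_pct * v / (culture_ml + v) for v,
    assuming volumes are additive.
    """
    if not 0 < final_pct < stock_pct:
        raise ValueError("final concentration must be positive and below the stock")
    return culture_ml * final_pct / (stock_pct - final_pct)


# 5 g paraformaldehyde brought to 25 ml gives the 20% stock.
print(percent_w_v(5, 25))               # 20.0
# A 20-ml culture needs 5 ml of 20% stock for 4% final, as stated in step 2.
print(stock_volume_to_add(20, 20, 4))   # 5.0
# The 50-ml culture of step 1 would need 12.5 ml by the same formula.
print(stock_volume_to_add(50, 20, 4))   # 12.5
```

The same function can be reused for the postspheroplasting fixation, where the starting volume is the spheroplast suspension rather than the culture.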

(B) Spheroplasting

6. Harvest cells at 1200 × g for 5 min at room temperature in
preweighed 50-ml polypropylene tubes.

7. Decant the supernatant and weigh the cell pellet. 

8. Resuspend the cells in 0.1 M EDTA-KOH (pH 8.0), 10 mM DTT,
using 1 ml per 0.1 g of cells. DTT must be added fresh. Use roughly
1/20 culture volume of EDTA-DTT solution.

9. Incubate at 30 °C for 10 min with gentle agitation. 

10. Collect the cells by centrifugation at 800 × g for 5 min at room
temperature. 

11. Carefully resuspend the cell pellet in 1 ml/0.1 g cells YPD + 1.2 M
sorbitol (mix 22 g sorbitol with 100 ml YPD). To resuspend
evenly, suspend the cell pellet first in 500 µl.

12. Add lyticase (β-glucanase; Verdier et al., 1990) to 250-500 U/ml
and predissolved Zymolyase (20T, Seikagaku) to a final 10-100 µg/ml.
This step is critical; appropriate amounts of enzyme should be
determined in a trial experiment with the same cells. 

For a 20-ml culture we use 2 ml of solution with 12 µl lyticase
(40,000 U/ml) and 4 µl of Zymolyase (20T) freshly dissolved in YPD
at 5 mg/ml. Because diploid strains spheroplast faster than haploid 
strains, we often pretreat with only 1 mM DTT and use half of the 
final concentration of lyticase and Zymolyase for diploid cells. 

13. Incubate at 30 °C in the original Erlenmeyer flask with gentle 
agitation (150 rpm) and monitor spheroplast formation in the 
microscope at 5, 10, 15, and 20 min. 

The appropriate stage of spheroplasting is determined by micro- 
scopic observation with polarized light. Initially cells will have a bright 
interior and a bright halo. Well spheroplasted cells become dark with a 
bright halo around the cell shape. When cells are dark inside and do not 
have the bright halo outside anymore, spheroplasting has been carried 
out for too long. This leads to a loss of antigen by diffusion and an 
altered 3D structure of the cell. In a given culture, the speed of sphero-
plasting varies among cell stages; it is therefore advisable to stop
digestion when 50% of the cells are properly spheroplasted.

14. Dilute with YPD + 1.2 M sorbitol to 40 ml. Centrifuge 5 min at 
800 × g.

15. Wash twice in 40 ml YPD + 1.2 M sorbitol, resuspending gently 
using a rubber bulb on the end of pipette (do not vortex or use 
glass rods). Centrifuge 5 min at 800 × g.

If cells were not fixed prior to spheroplasting, resuspend the spher-
oplasts gently in 0.5× culture volume of YPD + 1.2 M sorbitol, and fix
by incubating at room temperature in 4% paraformaldehyde (final
concentration) for 15 min. All washes should be done with YPD +
1.2 M sorbitol to avoid cell lysis. 
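
The enzyme amounts quoted for step 12 can be cross-checked with a short calculation. This is a minimal sketch, not part of the original protocol: the function names are illustrative, and the small volume of added enzyme is ignored relative to the 2 ml of cell suspension.

```python
def final_conc(stock_conc_per_ml: float, stock_ul: float, total_ml: float) -> float:
    """Final concentration (same units as the stock, per ml) after adding
    stock_ul microliters of stock to a suspension of total_ml milliliters."""
    return stock_conc_per_ml * (stock_ul / 1000.0) / total_ml


def stock_ul_needed(target_per_ml: float, stock_conc_per_ml: float, total_ml: float) -> float:
    """Microliters of stock needed to reach target_per_ml in total_ml."""
    return 1000.0 * target_per_ml * total_ml / stock_conc_per_ml


# Values from the text: 12 µl lyticase at 40,000 U/ml and 4 µl Zymolyase
# at 5 mg/ml (5000 µg/ml), each in 2 ml of cell suspension.
lyticase_u_per_ml = final_conc(40_000, 12, 2)
zymolyase_ug_per_ml = final_conc(5_000, 4, 2)
print(round(lyticase_u_per_ml), "U/ml lyticase")
print(round(zymolyase_ug_per_ml), "µg/ml Zymolyase")

# Half the lyticase concentration, as suggested for diploid cells.
print(round(stock_ul_needed(lyticase_u_per_ml / 2, 40_000, 2), 2), "µl lyticase")
```

With the quoted volumes the lyticase concentration comes out slightly below the 250 U/ml lower bound and the Zymolyase at the lower bound of 10 µg/ml, which is in keeping with the advice to titrate the enzymes in a trial experiment.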

(C) Cell permeabilization 

16. Resuspend fixed spheroplasts thoroughly in YPD + 1.2 M sorbitol
(0.5 g in 0.8 ml). Sorbitol can be omitted for cells that were fixed
prior to spheroplasting. The concentration of cells in this
suspension should be such that only one layer of nonconfluent
cells will adhere to the slide. Leave a drop on each spot of Teflon-
coated slides (Super-Teflon slides, Milian) for 1-2 min to allow
adherence, and remove as much liquid as possible using a pipet.
Superficially air dry 2 min. All the following washes are performed 
by immersing the slide in a Coplin jar containing the indicated 
solution. 

17. Place the slides in prechilled methanol at -20 °C for 6 min.

18. Transfer the slides to prechilled acetone at -20 °C for 30 s.

19. Air dry 3 min. 

(D) Antibody treatment (IF) 

20. Incubate slides in 1× PBS (Sambrook et al., 1989) + 1% ovalbu-
min + 0.1% Triton X-100 for 20 min or more. Shake gently two
or three times at room temperature. After this step the cells appear 
transparent and nuclei can be seen as a dark spot. This is an 
indication of good spheroplasting. If this is not the case, it may 
help to leave the slides for up to an hour in PBS + 1% ovalbumin 
+ 0.1% Triton X-100. 

21. Dry the Teflon surfaces and bottom of the slides with a paper 
tissue. 

22. Cover each spot on the slide with 25 µl of the appropriate primary
antibody diluted as required in PBS containing 0.1% Triton
X-100. For affinity-purified antibodies start with a 1/20 dilution
in 0.5× PBS + 0.1% Triton X-100 to avoid high salt concentra-
tions. For overnight incubation Triton should be avoided.

23. Incubate for 1 h at 37 °C in a humid chamber or overnight at 
4 °C. In the latter case the slides should be covered with a 
coverslip (but not sealed) to avoid drying of the antibody solution. 

24. Preabsorb the secondary antibody on yeast cells. For this purpose, 
use the remaining fixed spheroplasts by washing them three times in PBS
containing 0.1% Triton X-100 and resuspending them in 1 ml of 
PBS. Dilute the secondary antibody (stock is usually 1 mg/ml) 
1:250 in this spheroplast suspension and incubate for 30 min on a 
rotating wheel at 4 °C in the dark. Centrifuge at top speed. Store 
on ice until needed. 

25. After the primary antibody incubation, wash the slides 3 × 5 min
by immersion in PBS + 0.1% Triton X-100 in a Coplin jar at
room temperature.

26. Dry the Teflon surfaces and bottom of the slides. Cover each slide
with 25 µl/spot of the fluorescent secondary antibody after pre-
adsorption and incubate for 1 h either at room temperature or
37 °C in a dark, humid chamber.

27. After the secondary antibody, wash the slides 3 × 5 min in PBS +
0.1% Triton at room temperature. 

(E) In situ hybridization probes 

To label probes for FISH, plasmids containing the target sequence can 
be used as well as PCR fragments amplified using appropriate primers. 
For optimal FISH signals a fragment of 6-10 kb from a genomic locus
should be used as a template for nick-translation to prepare probes. 
Fragments as small as 2 kb can be used, although labeling efficiency will 
be lower. Final probe length should be between 200 and 300 nucleo-
tides after nick-translation. This can be checked by running the final
probe on a 2% agarose gel. Probes for FISH are labeled by a nick- 
translation protocol for which kits are commercially available (e.g., 
Nick Translation Mix, Roche). The fluorescent labeling can be carried 
out either during the nick-translation reaction or indirectly using an 
antibody against modified nucleotides. Detailed protocols for probe 
preparation have been published previously (Gotta et al., 1999; Heun
et al., 2001a).

Direct labeling of the probe is achieved by using a fluorescently 
labeled dUTP (Alexa fluor dUTP, Invitrogen) in place of dTTP in 
the nick-translation reaction. Efficiency of the Alexa-dUTP incor- 
poration into the probe can be quantified using a Fluorimeter 
(NanoDrop), or the fluorescence in the dried probe pellet can be 
directly visualized under a fluorescent microscope. Alternatively, 
commercially available kits offer a two-step labeling using amine- 
modified dUTP, which will then be cross-linked to fluorochromes 
(FISH-Tag, Invitrogen). Since the amine modification is small com- 
pared to Alexa molecules, the nick-translation reaction is more effi- 
cient and the resulting probe is brighter. Finally, probe labeling can 
also be achieved using digoxigenin-derivatized dUTP (dig-dUTP, 
Roche). Note that the detection of the digoxigenin-derivatized 
dUTP will require an antidigoxigenin fluorescent primary antibody 
or an antidigoxigenin primary antibody and a fluorescently labeled 
secondary antibody. This approach can be used to amplify weak 
signals. 

(F) FISH 

If only FISH is to be performed go directly from step 19 (cell permea-
bilization) to step 30. For a combined IF/FISH protocol continue here
with step 28, which prevents primary or secondary antibody dissocia-
tion under the harsh conditions used for FISH.

28. Postfix the cells in 4× SSC, 4% paraformaldehyde for 20 min at room
temperature after the last wash. Rinse 3 × 3 min in 4× SSC.

29. Immerse cells in 4× SSC, 0.1% Tween 20, 20 µg/ml preboiled
RNaseA (optional). Incubate overnight at room temperature (in 
the dark if IF was performed). 

30. Wash in H2O.

31. Dehydrate in ethanol: 70%, 80%, 90%, and 100% consecutively at
-20 °C, 1 min each bath.

32. Air dry. 

33. Add 10 µl/spot of 2× SSC, 70% formamide. Cover with a cover-
slip. Leave 5 min at 72 °C (place the slide on top of an aluminum
block that is partially submerged in a 72 °C water bath. On the
narrow edges of the slide, place a few drops of water, which will
spread between the aluminum block and the slide, improving
heat conductance).

34. Dehydrate in ethanol: 70%, 80%, 90%, and 100% consecutively at 
-20 °C, 1 min/bath. 

35. Air dry. 

36. Apply hybridization solution, 3 µl for each spot. The optimal
concentration of probe depends on the sequence and must be 
determined empirically. Place a coverslip on top avoiding air 
bubbles, seal with nail polish. 

37. Incubate 10 min at 72 °C. 

38. Incubate 24-60 h at 37 °C. 

39. Remove the coverslip and wash twice in 0.05× SSC, 5 min at
40 °C. 

40. Incubate in BT buffer (0.15 M NaHCO3, 0.1% Tween 20, pH
7.5) with 0.05% BSA, 2 × 30 min at 37 °C in the dark.

41. If the FISH probe was labeled using digoxigenin-derivatized dUTP,
continue with step 42.

If the probe was labeled using a fluorescent dUTP, go directly to visuali-
zation or stain DNA by following step 45.

42. Add mouse antidigoxigenin diluted 1:50 in BT buffer without
BSA + the secondary goat anti-mouse or anti-rabbit antibody 1:50 (for
refreshing the IF signal, if necessary; Boehringer Mannheim).
Stock solutions are usually 1 mg/ml. At this point you can either
use derivatized sheep anti-DIG (rhodamine or FITC derivatized) or
detect the probe in two steps, first with a nonderivatized primary
mouse anti-DIG, and then with a secondary fluorescent antibody
to amplify the anti-DIG signal. For two-step labeling, repeat steps
42-44 twice.

43. Incubate 1 h at 37 °C in a humid chamber. 

44. Wash 5 × 3 min in BT buffer.

(G) DNA visualization

To visualize DNA, choose a stain whose excitation and emission
wavelengths do not overlap with those of the fluorophores used. The
most frequently used staining agents are ethidium bromide (diluted to
1 µg/ml in antifade reagent; excitation 518 nm/emission 605 nm), DAPI
(1 µg/ml; excitation 358 nm/emission 461 nm), or cyanine nucleic acid
dyes (TOTO/POPO/YOYO/BOBO family of dyes, Molecular Probes).

45. Add 25 µl/spot of the DNA stain diluted in 1× PBS + 0.1%
Triton X-100 for 10 min at room temperature.

46. Wash in 1× PBS + 0.1% Triton X-100.

47. Dry the black Teflon surface and bottom of the slides and add one 
drop of antifade solution (Prolong antifade, Invitrogen). An alter-
native antifade is 1× PBS, 50% glycerol, 24 µg diazabicyclo-
2,2,2-octane (DABCO), pH 7.5.

48. Cover with a coverslip avoiding air bubbles. Slides can be exam- 
ined immediately or kept at 4 °C in the dark overnight. For longer 
storage, seal the coverslip with nail polish and keep at 4 °C in the 
dark or at -80 °C.
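
The choice of DNA stain in Section G can be illustrated with a small compatibility check. This is a rough sketch rather than a prescription: the stain peaks are the ones quoted in the text, but the fluorophore peaks, the 40-nm separation threshold, and the function name are assumptions made for illustration. Real spectra are broad, so supplier spectra and the filter sets on the microscope should be the final guide.

```python
# Peak excitation/emission wavelengths (nm) for the DNA stains, as quoted in the text.
DNA_STAINS = {
    "ethidium bromide": (518, 605),
    "DAPI": (358, 461),
}

# Approximate peaks (nm) for two common labels. These values are assumptions
# for illustration and should be replaced with the supplier's data.
FLUOROPHORES = {
    "FITC": (495, 519),
    "TRITC": (550, 575),
}


def compatible_stains(labels_in_use, min_separation_nm=40):
    """Return DNA stains whose emission peak lies at least min_separation_nm
    from the emission peak of every label in use (a crude heuristic)."""
    compatible = []
    for stain, (_, stain_em) in DNA_STAINS.items():
        clashes = [
            label for label in labels_in_use
            if abs(stain_em - FLUOROPHORES[label][1]) < min_separation_nm
        ]
        if not clashes:
            compatible.append(stain)
    return compatible


print(compatible_stains(["FITC"]))           # both stains pass the heuristic
print(compatible_stains(["FITC", "TRITC"]))  # only DAPI passes
```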



4.5. Special notes 

To monitor several targets at once, primary antibodies from different species
(e.g., mouse, rabbit, sheep) must be used together with species-specific
secondary antibodies. To reduce incubation times, we recommend mixing the
primary antibodies, and likewise the secondary antibodies. However, it is
essential to pretest each secondary antibody with each of the primary
antibodies separately to ensure that they do not cross-react.

An alternative way to localize proteins both in living and in fixed cells is to
generate a GFP fusion (Shaw et al., 1997), although proteins fused to GFP must
be tested for proper functionality. When the GFP fluorescence signal is very
strong (abundant or overexpressed proteins), it can sometimes be visualized
after the IF protocol. For weaker signals, or CFP fusions, samples should be
fixed with 1% paraformaldehyde for 3 min and washed at least three times with
1× PBS. These samples need to be visualized by microscopy as quickly as
possible. Epifluorescence (particularly for CFP) will not last longer than a week at
4 °C. For visualization of a strong GFP signal, cells can also be fixed with 80%
ethanol for 5 min and washed with 1× PBS containing DAPI. Alternatively, GFP
fusions can be detected using the IF protocol and anti-GFP antibodies.

It is not always necessary to preserve 3D nuclear structure, for example, 
for scoring mitotic or meiotic chromosome pairing (Guacci et al., 1994;
Weiner and Kleckner, 1994). However, one must be careful not to draw 
conclusions about nuclear architecture from results obtained with flattened 
or spread preparations.

For time-course experiments, or when a large number of samples need
to be handled (20 or more), we recommend fixing overnight at 4 °C and
performing spheroplasting the next day. For reasons that remain unclear, only
half the amount of lyticase and Zymolyase is needed under these circumstances.
Moreover, spheroplasts can be spotted on glass, permeabilized and kept at 
4°C in blocking solution without Triton X-100 for extended periods of 
time. Prolonged exposure to Triton X-100 should be avoided. Some 
protocols recommend coating slides with poly-lysine (Sigma, P8920) to 
promote spheroplast or cell attachment, but we avoid it because it increases 
background fluorescence. Plastic multiwell slides (µ-Slide, Ibidi) can be
used to spot multiple samples on one slide, reducing the number of slides 
needed. 

This protocol is not only useful for S. cerevisiae, but has also been
successfully used for Neurospora crassa. We used Novozyme 234 (Novo
Biolabs) instead of Zymolyase to digest the Neurospora cell wall, and incu-
bation times with the antibodies tested were increased to 48 h.



Abstract

DNA within the yeast nucleus is spatially organized. Yeast telomeres cluster 
together at the nuclear periphery, centromeres cluster together near the spin- 
dle pole body, and both the rDNA repeats and tRNA genes cluster within the 
nucleolus. Furthermore, the localization of individual genes to subnuclear 
compartments can change with changes in transcriptional status. As such, 
yeast researchers interested in understanding nuclear events may need to 
determine the subnuclear localization of parts of the genome. This chapter 
describes a straightforward quantitative approach using immunofluorescence 
and confocal microscopy to localize chromosomal loci with respect to well 
characterized nuclear landmarks. 



Chromosomes within the yeast nucleus are spatially organized. Parts of 
chromosomes are associated with different subnuclear compartments such as 
the nucleolus, the nuclear envelope or the spindle pole body (Berger et al.,
2008; Dorn et al., 2007; Heun et al., 2001; Jin et al., 2000). Localization to
these subnuclear organelles is not static. Chromosomal DNA is in constant
motion and exhibits varying degrees of constrained diffusion (Cabal et al.,
2006; Casolari et al., 2004; Chuang et al., 2006; Schmid et al., 2006;
Shav-Tal et al., 2004). Sophisticated techniques have been developed to 
understand the dynamics of chromosomal elements in living cells (Chapter 
21). However, to determine where a gene localizes within the nucleus and 
how its localization is affected by inputs of interest (such as transcriptional 
activation), simpler methods can be used. Here, we describe a quantitative 
method for localizing chromosomal loci with respect to subnuclear land- 
marks using established immunofluorescence methods. 




1. Yeast Strain Construction 

This protocol is based on binding of the lac repressor from Escherichia
coli to an array of lac repressor-binding sites integrated near the chromo-
somal locus of interest (Fig. 22.1A; Robinett et al., 1996; Straight et al.,
1996). Similar experiments can be carried out using the Tet repressor array
(Abruzzi et al., 2006; Cabal et al., 2006; Chekanova et al., 2008; Dundr et al.,
2007; Fischer et al., 2004; Kohler et al., 2008; Kumaran and Spector, 2008).
Depending on the sensitivity of the microscope being used, this method
requires >100 binding sites. In our experiments, we readily visualize an
array of ~128 lac repressor-binding sites (lac operators/LacO array). To
introduce this array at a locus of interest, the LacO array is first cloned into a
plasmid with an appropriate selective marker (Fig. 22.1A). The LacO array
was originally cloned in plasmid pAFS52, an integrating TRP1-marked
plasmid (Brickner and Walter, 2004; Straight et al., 1996). We have also
moved the LacO array (as a HindIII-XhoI fragment) into pRS306 (URA3-
marked integrating plasmid; Sikorski and Hieter, 1989) to create
p6LacO128 (Brickner and Walter, 2004). Whenever cloning the LacO
array, it is important to confirm that the array is the expected size (>5 kb)
by restriction digestion, as the array is sometimes lost or reduced in size after
propagation in E. coli.

The next step is to introduce a targeting sequence into this plasmid so
that the LacO array can be integrated at a locus of interest by homologous
recombination. Because the sequences in the plasmid will be duplicated
upon recombination (Fig. 22.1A), it is important to consider carefully
which targeting sequences to clone. When localizing genes, we usually
use sequences at the 3′ end of the gene to introduce the LacO array down-
stream of the 3′ UTR and to avoid duplicating the promoter. Also, to allow
for homologous recombination, the targeting sequence must include a
restriction site that will be unique in the context of the LacO array plasmid.
Table 22.1 shows a list of sites that are absent from the LacO array.

Figure 22.1 Schematic of the chromatin localization assay. (A) Strategy for marking a
locus of interest with GFP. A plasmid containing the lac operator array (LacO) and both a
marker and a targeting sequence is digested at a unique restriction site within the
targeting sequence. Transformation and homologous recombination introduce the
marker and the lac operator array into the yeast genome, flanked by the targeting
sequence. The strain into which the plasmid is transformed also expresses the Lac
repressor fused to GFP (GFP-LacI). (B) Confocal slices through a nucleus. The nuclear
envelope is stained and a series of slices (numbered 1-5) are collected along the z-axis.
Not shown in this representation is the staining of the cortical endoplasmic reticulum,
which is visible when the Sec63-13myc marker is used. (C) Selection of the optimal slice.
Slices 1-5 from panel (B) are shown. Slice #3 has the brightest, most focused GFP-LacI
spot and is selected for scoring. The GFP-LacI spot in this cell is scored as nucleoplasmic.

Once a targeting sequence has been introduced into the LacO array 
plasmid, it is digested at a unique restriction site within the targeting 
sequence and transformed into yeast. We typically start with a yeast strain
that has previously been transformed with the GFP-Lac repressor (GFP-
LacI) and, where necessary, additional tagged proteins localizing to other
subnuclear domains (e.g., Sec63-13myc for the nuclear envelope/endoplas-
mic reticulum). The LacO array should be introduced last because not all
transformants will possess a full-length array. To identify transformants that 
possess the full-length array, we screen through four or five clones to 
identify those that exhibit a clear green dot of GFP fluorescence. Once 
we have confirmed that the array is intact, we create a frozen stock of 
this strain.

Table 22.1 Sites absent from the LacO array




2. Immunofluorescence



Once a locus of interest has been tagged, it can be colocalized with
subnuclear structures either in live cells or in fixed cells. Chapter 21 describes
methods for imaging chromosomal loci in live cells and defining their
dynamic behavior. Here, we describe how to use immunofluorescence
methods with populations of fixed cells to determine the localization of
genes. The resulting localization represents a dynamic distribution and is
expressed as the fraction of the population in which the chromosomal locus
and the protein marker colocalize. We focus on the localization of chromosomal loci
with respect to the nuclear periphery or the nucleolus. We have successfully 
used several markers for the nuclear periphery. We have used the 9E10 anti- 
myc monoclonal antibody (Santa Cruz Biotechnology) to localize a 13myc- 
tagged Sec63 as a marker for the nuclear envelope (Brickner and Walter, 
2004). This protein localizes throughout the endoplasmic reticulum 
(Gilmore, 1991). We have also used the 32D6 anti-Nsp1 monoclonal anti-
body from EnCor Biotechnology (Gainesville, FL) to stain nuclear pore
complexes. This has the advantage that it does not require expression of a 
tagged protein. To stain the nucleolus, we have used the 37C12 monoclonal 
anti-Nop5/6 antibody from EnCor Biotechnology and we have used 
epitope-tagged versions of Spc42 as a marker for the spindle pole body. 




3. Fixing Cells 



Most immunofluorescence protocols use formaldehyde fixation. This 
works well for certain proteins and certain organelles. However, we have 
found that the shape of the nucleus can be poorly preserved by formalde-
hyde fixation (Fig. 22.2). Therefore, we use methanol fixation. This fixation
method causes cells to shrink somewhat but it preserves the spherical shape
of the nucleus better than in formaldehyde-fixed cells.

Figure 22.2 Methanol fixation versus formaldehyde fixation. Shown are representa-
tive examples of cells fixed using either methanol (as described in the protocol) or 4%
formaldehyde (twice for 30 min). The ER/nuclear envelope was stained with Sec63-
myc (red) and the chromosomal locus (in this case, the INO1 gene) was stained with the
GFP-Lac repressor (green). Note the nonspherical shape of the nucleus in the formal-
dehyde-fixed cells.

1. Grow cells to OD600 ~0.3-0.7.

Note: This protocol has been used to examine cells from many different 
media, including both rich and synthetic media. Overgrown cells are more 
difficult to spheroplast and stain. 

2. Harvest 10^7-10^8 cells by centrifugation and discard the supernatant. We
typically harvest a culture of 10-15 ml at an OD600 ~0.1-0.7.

3. Suspend the cells in 1 ml of chilled methanol (stored at -20 °C).

4. Incubate at -20 °C for >20 min.

5. Harvest cells by centrifugation and resuspend in 1 ml SHA (1 M sorbitol,
50 mM HEPES, pH 6.8, 1 mM NaN3). The fixed cells can be stored at
4 °C (good for 4 or 5 days). 
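
The culture volume needed to harvest a given number of cells can be estimated from the optical density. The sketch below is illustrative only: the conversion factor of 2 × 10^7 cells/ml per OD600 unit is an assumption (it varies with strain, medium, and spectrophotometer) and should be calibrated by counting cells; the function name is hypothetical.

```python
# Assumed conversion factor; calibrate for your strain and spectrophotometer.
CELLS_PER_ML_PER_OD = 2e7


def culture_volume_ml(target_cells: float, od600: float,
                      cells_per_ml_per_od: float = CELLS_PER_ML_PER_OD) -> float:
    """Culture volume (ml) expected to contain target_cells at the given OD600."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    return target_cells / (od600 * cells_per_ml_per_od)


# 10^8 cells from a culture at OD600 = 0.5 under the assumed conversion.
print(culture_volume_ml(1e8, 0.5))   # 10.0
# 5 x 10^7 cells (the amount used for spheroplasting) from the same culture.
print(culture_volume_ml(5e7, 0.5))   # 5.0
```

Under this assumed conversion, the 10-15 ml cultures mentioned in step 2 fall within the 10^7-10^8 cell range for mid-log densities.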




4. Spheroplasting 

Before permeabilizing cells, the cell wall must be removed. Once the
cells are converted to fixed spheroplasts they should be processed for 
immunofluorescence immediately. 

1. Harvest 5 × 10^7 cells by centrifugation, 30 s.

2. Resuspend in 1 ml SHA + 0.2% β-mercaptoethanol.

3. Add 2.5 µl (50 units) lyticase (Sigma catalog #L4025).

4. Incubate at 30 °C for 30 min. Invert tubes occasionally. 

5. Harvest by centrifugation, 30 s. 

6. Resuspend in 0.5 ml SHA + 0.1% Triton X-100. Incubate 10 min at 
room temperature. 

7. Harvest by centrifugation, 30 s. 

8. Resuspend in 250 µl SHA.




5. Preparing Slides 

Ten-well slides from Carlson Scientific (Peotone, IL; catalog #101007)
are used. Depending on the configuration of the microscope, we usually do 
not use the column of wells closest to the edge of the slides because they are 
more difficult to image. Between strains or treatments, we leave a column of 
wells empty to avoid cross-contamination. For washes, we add buffers with a 
Pasteur pipette and carefully remove the buffers by aspiration (using low 
vacuum and holding the tip at an angle beside the well).

1. Coat each well with 20 µl 0.1% polylysine (Sigma-Aldrich, catalog
#P8920).

2. Incubate > 15 min at room temperature. 

3. Remove by aspiration and let dry completely. 

4. Add 20 µl fixed spheroplasts to each well and let settle for 15 min.
Aspirate to remove excess liquid. 

5. Wash cells twice with Buffer WT (1% nonfat dry milk, 0.5 mg/ml BSA,
200 mM NaCl, 50 mM HEPES-KOH (pH 7.5), 1 mM NaN3, 0.1%
Tween-20).




6. Antibody Incubations 

1. Dilute the primary antibodies into Buffer WT. 

Dilutions for antibodies we have used: anti-myc (1:200), anti-Nsp1
(1:200), anti-GFP (1:1000), anti-Nop5/6 (1:200), anti-HA (1:200).

2. Add 15 µl/well of diluted primary antibody in Buffer WT.

3. Incubate 60—90 min at RT. 

This incubation can be carried out overnight at 4 °C. 

4. Wash five times with 20 µl Buffer WT.

5. Dilute secondary antibodies in Buffer WT. 

We use Alexa Fluor 594 goat antimouse IgG (Invitrogen catalog #A- 
11032) and Alexa Fluor 488 goat antirabbit IgG (Invitrogen catalog #A- 
11008), diluted 1:200. 

6. Add 15 µl of diluted secondary antibody in Buffer WT.

7. Incubate 60—90 min at room temperature in the dark. 

This incubation can be carried out overnight at 4 °C. 

8. Wash five times with 20 µl Buffer WT.

To stain for DNA, include 0.3 µg/ml DAPI in the third wash, incubate
for 1 h at 4 °C (in the dark), and wash twice more. 
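
The antibody volumes needed for a slide can be worked out from the dilutions and the 15 µl/well quoted above. This is a minimal sketch: the function name and the 10% pipetting excess are assumptions, and "1:200" is interpreted here as one part antibody in 200 parts final volume.

```python
def antibody_mix(n_wells: int, dilution_factor: float,
                 ul_per_well: float = 15.0, excess: float = 1.1) -> dict:
    """Volumes (µl) of antibody stock and Buffer WT for n_wells.

    dilution_factor is the denominator of the dilution (200 for 1:200).
    excess adds a margin for pipetting losses (assumed 10%).
    """
    total_ul = n_wells * ul_per_well * excess
    antibody_ul = total_ul / dilution_factor
    return {
        "total_ul": total_ul,
        "antibody_ul": antibody_ul,
        "buffer_ul": total_ul - antibody_ul,
    }


# Eight wells of anti-myc at 1:200.
mix = antibody_mix(8, 200)
print({name: round(volume, 2) for name, volume in mix.items()})
```

For very small antibody volumes it is usually more practical to scale the mix up or to make an intermediate dilution than to pipette a fraction of a microliter.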




7. Mounting and Storage of Slides 

1. After aspirating most of the liquid from each well on a slide (without
letting the wells dry completely), quickly add 1-2 µl of Vectashield mount-
ing solution (Vector Laboratories, catalog #H-1000).

Aspirate and add mounting medium to each well before moving on to
the next well.

2. Carefully cover wells with a clean 50 mm coverslip. 

The mounting medium should fill the wells without overflowing into 
neighboring wells. 

3. Seal the slide by painting the seams with clear fingernail polish. Let dry in
the dark. 

4. The slides may be stored in the refrigerator, but they look best when 
they are fresh. 




8. Microscopy and Analysis 

We use a Zeiss LSM510 confocal microscope with a 100× 1.4 NA objec-
tive to image the cells. We have found that confocal imaging is the best system for
accurate localization of genes within the nucleus. We routinely visualize fixed
and stained cells using both a 30-mW 458/488/514 nm Argon laser and a 1-mW
543 nm Helium Neon laser. The pinhole is set to 146 nm, with the detector gain
set to 750-900 and the amplifier offset at approximately -0.3. It is not necessary to collect
three-dimensional stacks or to reconstruct whole cells. We use the motorized
stage control to step through the nuclear volume to find a confocal slice
(~0.7 µm in z dimension) that displays the most intense and most focused
spot for the lac repressor-GFP (schematized in Fig. 22.1C; slice #3). Then we
collect data from this slice in both channels. For experiments in which we are 
assessing peripheral localization, we only include cells in which the selected slice 
displays a clear ring staining for the nucleus (i.e., not at the top or the bottom of 
the nucleus, nor cells in which the nuclear envelope is incompletely stained). 

For each cell, we compare the localization of the GFP-Lac repressor 
(GFP-LacI) spot and the marker of interest (e.g., nuclear envelope or nucle- 
olus). In the case of the nuclear envelope, we bin cells into one of two classes 
(Brickner and Walter, 2004). If the center of the GFP spot is overlapping with 
the nuclear envelope, we classify the spot's localization as peripheral. If the 
center of the GFP spot is not overlapping with the nuclear envelope, we 
classify the spot's localization as nucleoplasmic. For each biological replicate, 
we collect data from 30 to 50 cells. For a given population, we determine the 
percentage of cells in which the spot localizes to the nuclear periphery. 
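
The two-class scoring described above reduces to a simple tally per replicate. The sketch below shows one way to record it; the labels, function name, and example counts are hypothetical and serve only to illustrate the calculation.

```python
from collections import Counter

VALID_CLASSES = {"peripheral", "nucleoplasmic"}


def percent_peripheral(scores) -> float:
    """Percentage of scored cells whose GFP spot was classed as peripheral."""
    counts = Counter(scores)
    unknown = set(counts) - VALID_CLASSES
    if unknown:
        raise ValueError(f"unexpected class labels: {sorted(unknown)}")
    n_cells = sum(counts.values())
    if n_cells == 0:
        raise ValueError("no cells were scored")
    return 100.0 * counts["peripheral"] / n_cells


# One hypothetical replicate of 40 cells: 26 peripheral, 14 nucleoplasmic.
replicate = ["peripheral"] * 26 + ["nucleoplasmic"] * 14
print(percent_peripheral(replicate))   # 65.0
```

Cells that fail the inclusion criteria (for example, no clear ring of nuclear envelope staining in the selected slice) should be left out of the list rather than given a third label.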

The light microscope has a resolution of 100-200 nm in the X-Y dimen-
sions (Born and Wolf, 1998; Pawley, 2006; Schermelleh et al., 2008). The
radius of the haploid yeast nucleus is approximately 1 µm. Therefore, GFP
spots within the shell corresponding to the outermost ~10% of the radius will
be scored as peripheral in our assay. The fraction of cells in which the GFP spot
would be expected to localize at the periphery by chance can be calculated
from the fraction of the total volume represented by this outermost 10% of the
radius. The volume of a sphere is biased to the outside; comparing spheres having
radii of 0.9 µm versus 1 µm, the smaller sphere has a volume (3.1 µm^3) that is 74%
of the volume of the larger sphere (4.2 µm^3). Therefore, this outermost shell
corresponds to 26% of the volume of the nucleus and an unbiased distribution
within the yeast nucleus should result in ~26% colocalization with the nuclear
membrane. This level of peripheral localization is the baseline for this assay.
Peripherally localized chromosomal loci score as peripheral in 60-85% of cells
using the scoring criterion above. For example, genes such as INO1 and GAL1
that are recruited to the nuclear periphery upon activation localize at the
periphery in ~30% of cells when they are repressed and in ~65% of cells when
they are activated (Brickner and Walter, 2004; Brickner et al., 2007; Casolari
et al., 2004, 2005; Schmid et al., 2006). More stably associated peripheral loci
such as telomeres localize at the nuclear periphery in ~85% of cells using this
assay (our unpublished data). 
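
The baseline estimate can be reproduced with a few lines of Python. This is a sketch under the same idealization used in the text (a spherical nucleus of radius 1 µm and a peripheral shell 0.1 µm thick); the function names are illustrative.

```python
import math


def sphere_volume(radius_um: float) -> float:
    """Volume of a sphere (µm^3) for a radius given in µm."""
    return 4.0 / 3.0 * math.pi * radius_um ** 3


def shell_fraction(radius_um: float, shell_thickness_um: float) -> float:
    """Fraction of a sphere's volume lying within shell_thickness_um of its surface."""
    inner_radius = radius_um - shell_thickness_um
    return 1.0 - (inner_radius / radius_um) ** 3


print(round(sphere_volume(0.9), 2))        # 3.05
print(round(sphere_volume(1.0), 2))        # 4.19
print(round(shell_fraction(1.0, 0.1), 3))  # 0.271
```

Working with unrounded volumes gives a shell fraction of about 27%, essentially the same as the ~26% obtained in the text from the rounded volumes. The fraction depends only on the ratio of shell thickness to radius, so a different estimate of the nuclear radius changes the baseline only if the resolution-limited shell thickness is held fixed.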

To determine the variance in the peripheral localization for a particular 
locus, we use three or more biological replicates (i.e., cells harvested from 
independent cultures). From these measurements we determine the mean 
value and the standard error of the mean. As a negative control for peripheral 
localization, we use the URA3 gene, a nucleoplasmic locus that localizes at 
the nuclear periphery in ~30% of cells (Brickner and Walter, 2004). To 
determine if peripheral localization is statistically significant, we use an 
unpaired t-test to compare the percentage of cells with peripheral localization
of the locus of interest with the percentage of cells with peripheral localization 
of the URA3 gene. In a typical experiment, the URA3 gene localizes at the 
nuclear periphery in 30% ± 5% of cells and a peripheral gene like INO1
localizes at the nuclear periphery in 65% ± 5% of cells (P = 0.0078).
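
The comparison can be sketched with the Python standard library alone. The replicate values below are hypothetical, chosen only to give means of 30% and 65% with a standard error of about 5%, and the equal-variance (Student's) form of the unpaired t-test is an assumption, since the text does not say which variant was used. The two-sided P value is obtained by numerically integrating the t density with Simpson's rule.

```python
import math
from statistics import mean, variance


def t_pdf(x: float, df: int) -> float:
    """Probability density of Student's t distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-sided P value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)


def unpaired_t_test(a, b):
    """Student's unpaired t-test with pooled variance. Returns (t, df, p)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df, two_sided_p(t, df)


# Hypothetical percent-peripheral values from three biological replicates each.
ino1_active = [56.3, 65.0, 73.7]
ura3_control = [21.3, 30.0, 38.7]
t, df, p = unpaired_t_test(ino1_active, ura3_control)
print(f"t = {t:.2f}, df = {df}, P = {p:.4f}")
```

With three replicates per locus, means of 30% and 65%, and a standard error near 5%, this calculation gives a P value just under 0.01, in line with the value quoted in the text.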

We have used a similar strategy to localize chromosomal loci with 
respect to the nucleolus. The nucleolus in yeast is a single, crescent-shaped 
organelle that usually aligns opposite the spindle pole body (Stone et al.,
2000; Yang et al., 1989). The rDNA repeats and tRNA genes localize
within the nucleolus (Thompson et al., 2003). Recent work using high-
resolution probabilistic analysis has shown that, although most RNA poly-
merase II-transcribed genes are excluded from the nucleolus, some localize
within the nucleolus (Berger et al., 2008). In particular, the GAL2 gene
localizes within the nucleolus and becomes localized at the periphery of the 
nucleolus upon activation. We have also observed the nucleolar localization 
of GAL2 using immunofluorescence (Fig. 22.3). Immunofluorescence 
using commercially available antibodies against the Nop5 and Nop6 pro- 
teins defines the nucleolus, which typically occupies approximately one- 
third of the nuclear volume and appears as a crescent or spherical shape. 
Using the same confocal microscopy analysis, we have quantified the colo-
calization of GAL2 or INO1 with the nucleolus. The GAL2 gene colocalizes
with the nucleolus in most (~90%) of the cells in the population (Fig. 22.3;
Gard et al., 2009). In contrast, the INO1 gene localizes within the nucleolus
in ~10% of cells (Fig. 22.3; our unpublished data). Therefore, although the
nucleolus occupies a significant fraction of the nuclear volume, the baseline
colocalization of chromosomal loci with this structure is lower than
expected. This suggests that cells in which a nonnucleolar chromosomal
locus appears nucleolar because it lies above or below the nucleolus are not a
major source of background in these measurements. Therefore, this method
readily distinguishes between two classes of RNA polymerase II-transcribed
genes that differ in their localization with respect to the nucleolus.

Figure 22.3 Nucleolar colocalization. Cells were fixed and probed with anti-GFP and
anti-Nop5/6 as described in the protocol. In the cells shown in the top panels, the LacO
array was integrated beside the INO1 gene. In the cells shown in the bottom panels, the
LacO array was integrated beside the GAL2 gene. Note the clear separation of the
INO1 gene from the nucleolus and the localization of GAL2 within the nucleolus.

The methods described in this chapter allow colocalization of chromo- 
somal loci with respect to two major subnuclear compartments. It remains 
to be seen what fraction of the yeast genome localizes to particular subnu- 
clear domains. It is conceivable that there are few parts of the genome that 
are randomly localized. As the spatial organization of the nucleus becomes 
better understood, we anticipate that these methods can be adapted 
and improved to allow the localization of loci with respect to additional 
subnuclear compartments. 

Abstract 

Spinning-disk confocal microscopy is an imaging technique that combines the 
out-of-focus light rejection of confocal microscopy with the high sensitivity of 
wide-field microscopy. Because of its unique features, it is well suited to high- 
resolution imaging of yeast and other small cells. Elimination of out-of-focus 
light significantly improves the image contrast and signal-to-noise ratio, making 
it easier to resolve and quantitate small, dim structures in the cell. These 
features make spinning-disk confocal microscopy an excellent technique for 
studying protein localization and dynamics in yeast. In this review, I describe 
the rationale behind using spinning-disk confocal imaging for yeast, hardware 
considerations when assembling a spinning-disk confocal scope, and methods 
for strain preparation and imaging. In particular, I discuss choices of objective 
lens and camera, choice of fluorescent proteins for tagging yeast genes, and 
methods for sample preparation. 




1. Introduction 

In recent years, fluorescent protein tagging combined with microscopy 
has become a powerful tool for imaging subcellular localization in live yeast 
cells (Davis, 2004; Kohlwein, 2000; Rines et al., 2002). A particularly 
popular imaging method is spinning-disk confocal microscopy, which 
provides a way around one of the fundamental limitations of fluorescence 
microscopy: the objective lens captures light not only from the region of the 
sample that is in focus but also from regions in the sample above or below 
the focus plane. This out-of-focus light reduces contrast in the image and 
obscures in-focus information. 

Spinning-disk confocal microscopy has become such a powerful tool for 
imaging yeast because it provides much better optical sectioning than a 
conventional fluorescence microscope, but much better sensitivity than a 
laser-scanning confocal. In general, laser-scanning confocal microscopes 
lack the sensitivity required to image small dim objects, and are unlikely 
to produce good images of any yeast sample that is not very bright. 
Conversely, conventional fluorescence microscopes can be very sensitive 
but do not reject out-of-focus light and so cannot acquire optical sections or 
produce high-resolution 3D reconstructions of the cell. Spinning-disk 
confocals can acquire high-resolution optical sections with good sensitivity 
and thus fill a niche between conventional fluorescence microscopes and 
laser-scanning confocals. They are particularly good when imaging small 
dynamic processes in vivo, such as cytoskeletal dynamics or vesicle and 
organelle movement. For imaging requiring less resolution, such as looking 
at translocation between the cytoplasm and nucleus or the abundance of 
a transcriptional reporter, they have relatively little advantage over a 
conventional fluorescence microscope. 

There are two general ways to deal with out-of-focus light; it can either 
be blocked from reaching the detector, as in confocal microscopy, or it can 
be computationally removed after the fact, as in deconvolution microscopy 
(Shaw, 2006). In confocal microscopy, a pinhole in the excitation light path 
excites a single spot in the sample. A confocal pinhole in the detection path 
then blocks all light that does not originate from the excited spot. Traditional 
laser-scanning confocal microscopes suffer from two drawbacks for live cell 
work. First, because they require scanning a single illumination spot over 
the sample, they are typically slow, acquiring approximately 1 frame per 
second (fps). Second, and more importantly, due to the detectors used and 
to the very short integration times at each pixel in the image, they are 
relatively insensitive. 

Spinning-disk confocal microscopes (reviewed in Toomre and Pawley, 
2006) avoid both of these problems by replacing the single pinhole in a laser- 
scanning confocal with multiple pinholes that simultaneously illuminate many 
points in the sample. These pinholes are arranged in a spiral pattern on a disk so 
that as the disk rotates, the pinholes sweep over every point in the sample, 
illuminating it uniformly. This greatly improves the speed of the spinning-disk 
confocal (imaging at 30 fps is routine) and, as the confocal now illuminates the 
entire field of view in a short period of time (typically 33 ms), it forms an image 
that can be recorded on a highly sensitive CCD camera (as opposed to an 
insensitive photomultiplier tube, as in a laser-scanning confocal). The net 
result is that spinning-disk confocal microscopy systems are both faster and 
more sensitive than laser-scanning confocals. The increase in sensitivity is 
striking; a spinning-disk confocal system can collect as many as 50 times 
more photons for a given exposure than a conventional laser-scanning 
confocal (Murray et al., 2007). Other confocal techniques, such as slit or line 
scanners, share some of the same advantages as spinning-disk confocal but 
have not been as extensively validated and so will not be discussed further. 

One common misconception about confocal microscopes, including 
spinning-disk confocals, is that a confocal system has higher resolution 
than a nonconfocal system. In theory, this is not true, in that a point-like 
object will be imaged to a blurry disk of the same width in both systems. 
That is, the width of the point spread function of both a confocal and a 
nonconfocal system will be the same. In practice, however, the achievable 
resolution of a microscope is often lower than theoretically expected due to 
a low signal-to-noise ratio of the image. Low signal-to-noise ratios can arise 
from weakly fluorescent samples, but out-of-focus light can also reduce the 
signal-to-noise ratio even on bright samples. Out-of-focus light contributes 
background to the image that reduces contrast and adds additional noise, 
which reduces the overall signal-to-noise ratio of the image. Thus, while 
the spinning-disk confocal does not improve the diffraction-limited resolu- 
tion of the microscope, it may improve the achievable resolution in real 
samples if this is limited by out-of-focus light contributing background 
fluorescence and noise to the images. 
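
The effect of background on the signal-to-noise ratio can be illustrated with a small calculation. This is a minimal sketch that assumes photon shot noise is the only noise source, so that the noise is the square root of the total (signal plus background) photon count; the function name and the photon numbers are illustrative, not measurements.

```python
import math

def shot_noise_snr(signal, background=0.0):
    """SNR of a pixel when only photon shot noise matters.

    signal and background are mean photon counts. Both contribute noise,
    but only the signal carries information about the structure of interest.
    """
    return signal / math.sqrt(signal + background)

# The same 100-photon signal on increasing out-of-focus background.
for background in (0, 100, 300):
    print(background, round(shot_noise_snr(100, background), 2))
```

In this simplified picture, a background three times brighter than the signal halves the signal-to-noise ratio, which is the loss that rejecting out-of-focus light recovers.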




2. Building a Spinning-Disk Confocal 
Microscope 

A photograph of the spinning-disk confocal in the Nikon Imaging 
Center at UCSF, with components labeled, is shown in Fig. 23.1A. A schematic 
of the major components in the optical path is shown in Fig. 23.1B. 

Figure 23.1 Overview of a spinning-disk confocal system. (A) Photograph of the 
spinning-disk confocal system in the Nikon Imaging Center at UCSF. It consists of a 
Nikon TE2000 microscope with a temperature controlled chamber from In Vivo 
Biosciences. The Yokogawa CSU-22 spinning-disk scanhead is labeled, along with 
the optical fiber bringing excitation light to the scanhead from the laser launch. A Sutter 
filter wheel holding emission filters is coupled to the exit of the scanhead; a projection 
lens is mounted after the filter wheel. The Cascade II EMCCD camera (Photometrics) is 
labeled as well. (B) Schematic diagram of the spinning-disk optical path. Solid lines 
show the excitation path, and the short dashed lines show the emission path. Elements 
within the dashed box are contained within the spinning-disk confocal scanhead. 

2.1. Microscope base 

Spinning-disk confocal systems can be built on microscopes from any of the 
major manufacturers, and on either an upright or inverted system, although 
mounting components is somewhat simpler on an inverted scope. Micro- 
scopes from all of the major manufacturers (Leica, Nikon, Olympus, and 
Zeiss) have excellent optical performance so the major considerations in 
choosing one are software support (not all software packages support all 
microscopes) and the availability of other features on the microscope. Most 
of the major microscope manufacturers have recently introduced hardware 
autofocus systems on their microscopes, such as the Nikon Perfect Focus 
System, the Zeiss Definite Focus, or the Olympus Zero Drift. These are 
devices that measure the position of the sample coverslip and adjust the 
focus to compensate for any movement of the sample, thereby maintaining 
focus at all times. These autofocus systems greatly improve focal stability and 
are considerably more precise and faster than image-based autofocus tech- 
niques. They also greatly simplify the acquisition of time-lapse movies, and 
if you will be acquiring time-lapse data for more than approximately 
30 min, are well worth the added expense. 

2.2. Scanhead 

While a number of disk-scanning confocal systems exist, spinning-disk 
confocals from the Yokogawa Electric Corp. have come to dominate the 
market because they include microlenses that focus the laser light through 
the pinhole disk, dramatically increasing the excitation efficiency of the 
system (Toomre and Pawley, 2006). Other spinning-disk confocal systems 
do not include these microlenses, making them much less bright at equiva- 
lent laser powers. There are two different versions of the Yokogawa 
spinning-disk confocal scanhead: the older CSU10 and the newer 
CSU-X1. The CSU-X1 has several improvements over the CSU10, includ- 
ing a redesigned excitation path that increases the amount of excitation light 
reaching the sample and a higher disk rotation rate that allows for frame rates 
faster than 30 fps. The CSU-X1 also has an optional bypass path that allows 
bypassing the spinning disk for conventional wide-field imaging, and an 
optional motorized filter changer to allow automated filter switching in the 
scanhead. This is helpful if you intend to build a system with a large number 
of laser lines. 



2.3. Lasers and filters 

Typically, the vendor that provides the spinning-disk scanhead will also 
provide the lasers and laser launch for the system. The laser launch contains 
the optics to combine the beams from multiple lasers together and to launch 



586 Kurt Thorn 

them into the single mode optical fiber attached to the spinning-disk scan- 
head. It will also typically have an acousto-optic tunable filter (AOTF) for 
rapidly switching laser lines. For spinning-disk confocal microscopy, it is 
probably best to avoid gas lasers (with the possible exception of Ar-ion 
lasers). The latest solid state lasers typically have sufficient power for confo- 
cal microscopy and are smaller, produce less heat, and have longer lifetimes 
than gas lasers. Typically, a 50-mW laser will supply sufficient power for 
spinning-disk applications, although some applications (particularly rapid 
imaging) may benefit from increased laser power. While the exact laser 
wavelengths used will depend on which fluorescent tags you wish to image, 
commonly used wavelengths include 405 nm (DAPI), 440 nm (CFP), 
473 nm, 491 nm (GFP), 532 nm, 561 nm (RFP), and 640 nm (Cy5). 

You will need a dichroic beamsplitter(s) in the scanhead that matches the 
laser lines you are using. The dichroic beamsplitter separates the emission 
light from the excitation light, allowing the emitted light to be imaged onto 
the camera. Additionally, for multicolor imaging, you will want a filter 
wheel at the exit of the scanhead with emission filters that define the 
wavelength band you wish to detect for each channel (Fig. 23.1B). This 
can either be a third-party filter wheel or a CSUX optional add-on filter 
wheel. Alternatively, if you are using a small number of wavelengths this can 
be replaced with a single multipass emission filter but this may result in cross 
talk between different channels. Multipass emission filters are, however, 
useful for rapid multicolor imaging by changing the excitation wavelength 
as they eliminate the need for moving any mechanical parts to change 
wavelength. Filters with very high transmission in the passband (>90%) 
are now available from the major vendors (Chroma Technology, Omega 
Optical, and Semrock) and are highly desirable to maximize the amount of 
light detected by the camera. 



2.4. Choice of objective 

Typically, for spinning-disk confocal microscopy of yeast cells one wishes to 
maximize the resolution of the images acquired. Ultimately, the achievable 
resolution is limited by the numerical aperture (NA) of the objective lens 
used to image the sample; the larger the numerical aperture, the higher the 
achievable resolution. The standard measure of resolution is the 
Rayleigh criterion, the minimum distance by which two point sources 
must be separated to be distinguished, given by 
$r_{\min} = 0.61\,\lambda/\mathrm{NA}_{\mathrm{obj}}$, where $\lambda$ is the 
wavelength of light emitted by the sample (Inoue, 
2006). Maximizing resolution, therefore, suggests the use of oil-immersion 
objectives with a numerical aperture of 1.4 or higher. Plan Apo objectives 
(corrected for both field flatness and chromatic aberration at multiple 
wavelengths; Keller, 2006) are available from all major manufacturers with a 
numerical aperture of 1.4. An objective with a numerical aperture of 1.4 will 
have a resolution limit of ~220 nm for GFP. 
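
The Rayleigh limit is easy to tabulate for candidate objectives. The sketch below assumes a GFP emission wavelength of about 510 nm (the text does not state the value it uses), and the function name is illustrative.

```python
def rayleigh_limit_nm(wavelength_nm, numerical_aperture):
    """Rayleigh resolution limit, r_min = 0.61 * wavelength / NA, in nm."""
    return 0.61 * wavelength_nm / numerical_aperture

GFP_EMISSION_NM = 510  # assumed emission peak for GFP

for na in (1.2, 1.4, 1.49):
    limit = rayleigh_limit_nm(GFP_EMISSION_NM, na)
    print(f"NA {na}: r_min = {limit:.0f} nm")
```

For NA 1.4 this gives roughly 220 nm, consistent with the figure quoted above.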

Such objectives are also well matched to the pinholes in the 
spinning-disk confocal. Because an objective has a finite resolution, the image of a 
point object smaller than the diffraction limit of the objective is blurred into 
a disk (the Airy disk) of radius $r_{\min} = 0.61\,\lambda/\mathrm{NA}_{\mathrm{obj}}$. 
For optimum confocality, the diameter of the confocal pinhole should match the diameter of 
this Airy disk times the magnification of the objective. When these two 
diameters are equal, the confocal pinhole will pass all of the in-focus light 
while rejecting the maximum amount of out-of-focus light. If the pinhole 
diameter is smaller than the Airy disk diameter, in-focus light will be 
blocked by the pinhole, reducing the overall efficiency of the system. If 
the pinhole is larger than the Airy disk, additional out-of-focus light will 
pass through the pinhole, reducing the confocality and contrast of the 
system. The Yokogawa CSU spinning-disk confocal has pinholes of 
diameter 50 µm, which are well matched to the Airy disk diameter of ~44 µm 
produced by a 100×/1.4 NA objective imaging GFP. 
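
The pinhole-matching argument can be checked numerically for other objectives. This sketch assumes a GFP emission wavelength of about 510 nm and that only the objective magnification acts between the sample and the pinhole disk, as in the text; the function name and the list of objectives are illustrative.

```python
PINHOLE_DIAMETER_UM = 50.0  # Yokogawa CSU pinhole diameter, from the text
GFP_EMISSION_NM = 510       # assumed emission peak for GFP

def airy_diameter_at_pinhole_um(wavelength_nm, na, magnification):
    """Diameter of the magnified Airy disk at the pinhole plane, in micrometers."""
    radius_nm = 0.61 * wavelength_nm / na
    return 2.0 * radius_nm * magnification / 1000.0

for magnification, na in ((100, 1.4), (100, 1.49), (60, 1.4)):
    diameter = airy_diameter_at_pinhole_um(GFP_EMISSION_NM, na, magnification)
    ratio = PINHOLE_DIAMETER_UM / diameter
    print(f"{magnification}x/{na} NA: Airy disk {diameter:.1f} um, "
          f"pinhole/Airy ratio {ratio:.2f}")
```

A ratio well above 1 means the pinhole is larger than the Airy disk and passes more out-of-focus light, as happens with a lower-magnification objective of the same numerical aperture.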

One drawback of oil-immersion objectives for imaging yeast is that 
imaging into a yeast cell in an aqueous environment introduces spherical 
aberration, owing to the refractive index difference between the immersion 
oil and the aqueous medium. An oil-immersion objective is designed to 
produce aberration-free images immediately adjacent to the coverslip. As 
one images deeper into the sample, the thickness of the immersion oil layer 
shrinks and the thickness of the water layer increases, changing the optical 
properties of the system and giving rise to spherical aberration 
(Fig. 23.2A). When viewed through a wide-field system (e.g., the eyepieces), 
spherical aberration manifests as an axial asymmetry in the point spread 
function, and can be recognized as haloing around point-like objects on one 
side of focus, with no haloing on the other side of focus (Fig. 23.2B). 
With spherical aberration, the detected light from the sample no longer 
comes to a tight focus. This reduces contrast and intensity, because the 
broader focal spot of the aberrated light from deeper in the sample is 
partially cut off by the spinning-disk pinholes. This effect can be quite 
noticeable even for objects as thin as a yeast cell. Indeed, measurable 
intensity falloff can be seen after imaging as little as 2 µm into the sample 
(Fig. 23.2C). 
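
The origin of the aberration can be illustrated with a simple ray-tracing sketch based on Snell's law. It is a geometric-optics simplification, not a model of a real objective: it assumes refractive indices of 1.515 for the oil and coverslip and 1.33 for the aqueous medium, and it asks where rays aimed at a nominal depth of 2 µm actually cross the optical axis after refraction at the coverslip boundary. All names and values are illustrative.

```python
import math

N_OIL = 1.515   # assumed index of immersion oil and coverslip glass
N_WATER = 1.33  # assumed index of the aqueous medium

def focus_depth_ratio(ray_na, n1=N_OIL, n2=N_WATER):
    """Actual / nominal focal depth for a ray with NA = n * sin(theta).

    A ray aimed at nominal depth d below the coverslip is refracted at the
    glass/water boundary and crosses the axis at d * tan(theta1) / tan(theta2).
    Returns None when the ray is totally internally reflected (ray_na >= n2).
    """
    if ray_na >= n2:
        return None
    if ray_na == 0:
        return n2 / n1  # paraxial limit
    theta1 = math.asin(ray_na / n1)  # angle in oil/glass
    theta2 = math.asin(ray_na / n2)  # angle in water (Snell's law)
    return math.tan(theta1) / math.tan(theta2)

nominal_depth_um = 2.0
for ray_na in (0.1, 0.5, 1.0, 1.2, 1.3):
    ratio = focus_depth_ratio(ray_na)
    print(f"ray NA {ray_na}: crosses axis at {nominal_depth_um * ratio:.2f} um")
```

Rays entering at steeper angles cross the axis at shallower depths than near-axial rays, so the light no longer converges to a single point; the spread grows in proportion to the nominal depth.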

This spherical aberration can be eliminated in one of two ways. First, 
using a water-immersion objective instead of an oil-immersion objective 
will eliminate the change in spherical aberration with imaging depth, because 
the oil layer is replaced with a water layer having the same refractive index as 
the specimen, so that the optical properties of the system no longer 
change as you image deeper into the specimen. A water-immersion objective 
will have a lower numerical aperture than an oil objective due to the 
lower refractive index of the immersion medium, which will result in lower 
resolution for the objective. However, as the spherical aberration introduced 
by using an oil objective also reduces resolution, the unaberrated 
water-immersion objective may have higher resolution than the aberrated 
oil objective. As the sample gets thicker, the benefit from using the 
water-immersion objective instead of the oil-immersion objective increases, as the 
induced spherical aberration increases with the distance from the coverslip. 

An alternate approach to eliminate spherical aberration is to add 
optics that correct for the spherical aberration. In principle, this can 
be done with adaptive optics approaches, but this remains highly 
challenging (Booth, 2007). A more practical approach is to introduce additional 
lenses that move as the objective moves in Z to compensate for the induced 
spherical aberration. One such device is the MID /SAC spherical aberration 
corrector available from Intelligent Imaging Innovations, which has been 
used to great effect for imaging samples up to 80 µm thick with a 
spinning-disk confocal microscope (McAllister et al., 2008). For imaging thin yeast 
samples, however, such an approach is probably overkill. An alternative 
approach is to use an objective with a correction collar, which allows the 
optics inside the objective to be adjusted to eliminate spherical aberration at 
any single focal plane. While such a correction collar (unless continuously 
adjusted) cannot correct for spherical aberration at all focal planes in the 
sample, setting it to minimize spherical aberration in the middle of the yeast 
cell minimizes spherical aberration over the thickness of the cell and has been 
found to give improved quantitation of fluorescent intensities over the full 
thickness of the cell (Fig. 23.2C). Oil-immersion objectives with correction 
collars are typically designed for total internal reflection fluorescence 
microscopy and so additionally benefit from having a slightly higher 
numerical aperture (up to 1.49). For applications where accurate three-dimensional 
intensity measurements are important, an oil- or water-immersion objective 
with a correction collar is highly recommended. The correction 
collar is set by adjusting it to achieve symmetry in the point spread 
function above and below focus around the middle of the cell, while 
observing the point spread function of a small object in the cell 
through the eyepieces. One of the 
drawbacks to the use of objectives with correction collars is that setting the 
collar accurately takes practice, and if the collar is set inaccurately it will 
introduce additional spherical aberration. For this reason, we have both a 
100×/1.4 NA oil lens (without a correction collar) and a 100×/1.49 NA oil 
lens (with a correction collar) on our spinning-disk confocal. The 1.4 NA 
lens is used for routine applications and the 1.49 NA lens is used for more 
demanding 3D reconstructions. For thicker specimens, such as yeast 
biofilms, a water-immersion lens may give more satisfactory performance. 
While the specific objectives mentioned here are Nikon objectives, other 
microscope manufacturers have generally similar objectives. 

Figure 23.2 Spherical aberration. (A) Oil-immersion objectives are designed to focus 
at the coverslip (redrawn from a figure by Mats Gustafsson). Under these conditions, 
shown in the left-hand portion of the figure, all light rays entering the sample focus to a 
single point. When imaging into an aqueous sample, as shown on the right, refraction 
occurs at the boundary between the coverslip and the solution. This additional 
refraction differs for rays entering at different angles, and as a result the light is no longer 
tightly focused to a single spot. This is spherical aberration. (B) Unaberrated and 
aberrated images of beads, acquired on an epifluorescence microscope. Spherical 
aberration can be recognized by haloing around point objects on one side of focus. 
On the left is shown a series of Z slices of a 100 nm bead, taken at 0.5 µm intervals. 
These were taken under conditions of minimal spherical aberration, and the image of 
the bead (the point spread function) is relatively symmetric above and below focus. On 
the right is an image of the same bead with spherical aberration induced; the Z slices are 
now taken at 1 µm intervals. The pronounced asymmetry between focusing above and 
below the bead is now apparent. Observing small structures in a sample and adjusting 
the correction collar to minimize the asymmetry in the point spread function above and 
below focus is a good way to minimize spherical aberration in your images. (C) The 
effect of spherical aberration on quantitative intensity measurements with a 
spinning-disk confocal (data provided by Susanne Rafelski). A 2-µm bead was imaged with both 
a Plan Apo 100×/1.4 NA objective (without a correction collar) and an Apo TIRF 
100×/1.49 NA objective, with the correction collar set to minimize spherical 
aberration. The average intensity from a small region in the center of the bead is plotted as a 
function of Z. The intensity falls off rapidly when imaging into the sample with the Plan 
Apo objective, but is much more symmetric when using the TIRF objective with 
correction collar. The image of the bead acquired with the correction collar adjusted 
to minimize spherical aberration better matches the actual brightness of the bead, which 
should be uniform throughout its thickness. 

For applications where high resolution and high contrast are critical, it is 
probably best to compare both oil- and water-immersion objectives. The 
achievable resolution in a spinning-disk confocal experiment depends on 
the numerical aperture, magnification, and aberrations of your microscope 
in a complex way, and the solution with the best performance is not always 
predictable a priori. In particular, while water-immersion objectives have 
lower magnification and numerical aperture, and would therefore be 
expected to have lower resolution, they also introduce less of the spherical 
aberration that reduces contrast and degrades resolution. In a real experiment, 
the achievable resolution depends not only on the theoretical resolution of 
the objective but also on the intensity of the signal and the background, and 
on any aberrations, and so can only be determined empirically. 

2.5. Cameras 

For any spinning-disk confocal system, you will need a camera to acquire 
images for analysis and publication. The camera should be highly sensitive, 
so as to capture as many photons as possible that arrive at it, and low noise, so 
as to contribute as little extra noise to the image as possible. Camera 
sensitivity is measured by the quantum efficiency of the camera, which 
measures the fraction of photons incident on the CCD that are recorded. 
For spinning-disk confocal microscopy, particularly on dim specimens, 
cameras with very high quantum efficiencies are desirable, so that nearly 
every photon collected by the microscope is recorded on the camera. The 
highest possible quantum efficiencies are achieved by back-thinning of the 
CCD chip, whereby the substrate the chip is grown on is physically ground 
away and the CCD is illuminated from the back. Illuminating the CCD 
chip from the back eliminates absorption and scattering by the electronics 
fabricated on the front face of the CCD and allows quantum efficiencies in 
the visible wavelength range of ~90%. By contrast, nonback-thinned 
CCDs (front-illuminated CCDs) have typical quantum efficiencies of 
~60% (Art, 2006; Janesick, 2001). 

Noise in the image comes from a number of sources, of which two are 
dominant: photon shot noise and camera read noise. Photon shot noise 
occurs because photons are collected in integer numbers (so that while a 
given pixel may record 10 photons on average, sometimes it will record 
more or fewer) and is unavoidable. The standard deviation due to the photon 
shot noise is given by the square root of the number of photons collected, 
and so the signal-to-noise ratio is also given by the square root of the 
number of photons collected. Thus, the only way to reduce the relative 
contribution of photon shot noise is by collecting more photons, for example, by exposing longer. 
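
The square-root scaling of shot noise can be checked with a short simulation. This is a minimal sketch using only the Python standard library: photon counts are drawn from a Poisson distribution with Knuth's multiplication method, and the ratio of mean to standard deviation is compared with the square root of the mean. The function names, pixel count, and photon levels are illustrative.

```python
import math
import random
import statistics

def poisson_sample(mean, rng):
    """Draw one Poisson-distributed count (Knuth's multiplication method).

    Suitable only for modest means: exp(-mean) underflows for means above ~700.
    """
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def simulated_snr(mean_photons, n_pixels=5000, seed=0):
    """Mean / standard deviation of simulated photon counts over many pixels."""
    rng = random.Random(seed)
    counts = [poisson_sample(mean_photons, rng) for _ in range(n_pixels)]
    return statistics.mean(counts) / statistics.stdev(counts)

for mean_photons in (10, 100, 400):
    print(f"{mean_photons} photons: simulated SNR {simulated_snr(mean_photons):.1f}, "
          f"sqrt(N) = {math.sqrt(mean_photons):.1f}")
```

The simulated ratio should track the square root of the photon number, so a fourfold longer exposure buys only a twofold improvement in signal-to-noise ratio.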

Camera read noise is noise introduced by the digitization process in the 
camera. This introduced noise increases with the readout speed of the 
camera: the faster the camera is read out, the higher the read noise. 
A typical camera that can be read out at ~10 fps, such as the Coolsnap 
HQ2 (Photometrics) or Orca-R2 (Hamamatsu), has ~6-8 e⁻ of read noise. 
Very low read noise can be achieved with a very slow readout speed. For 
instance, the Orca-II-ER (Hamamatsu) has a read noise of 4 e⁻, but takes 
1.2 s to read out a single image. For many applications of spinning-disk 
confocal microscopy, such a frame rate is prohibitively slow. 

To achieve a fast frame rate while maintaining low noise, the solution 
has been to amplify the signal before reading it out. Since the signal-to-noise 
ratio depends on both the signal strength and the amount of noise, the 
signal-to-noise ratio can be increased either by amplifying the signal or by 
reducing the noise. Amplifying the signal or reducing the noise by the 
same factor results in the same increase in the signal-to-noise ratio. 
Therefore, amplification can be thought of as reducing the read noise by 
the same factor, which is why amplified cameras often quote an effective 
"read noise" of <1 e⁻. This was first done by using an image intensifier 
attached to the CCD, a so-called intensified CCD or ICCD. The image 
intensifier greatly amplifies the number of photons prior to their arrival at 
the CCD (by up to a million-fold) rendering even the high read noise 
typical of a fast camera negligible. However, ICCDs suffer from a number 
of drawbacks including low quantum efficiency and potential damage due 
to brief exposure to bright light, so they have been largely supplanted for 
routine imaging by a newer technology, the electron multiplying CCD 
(EMCCD). 

An EMCCD is essentially a normal CCD with a gain register located 
prior to the readout electronics. The photoelectrons from the CCD are 
shifted through the wells of this register at higher than normal voltages, 
resulting in a gain of ~1% for each transfer step. That is, after a single shift 
event through this register, 100 electrons will on average be amplified to 
101 electrons. The gain register consists of several hundred to a thousand 
wells, so shifting through the entire register results in very high gains, up to 
1000-fold (Art, 2006; Janesick, 2001). This method of amplifying the signal 
has several advantages over that of an ICCD. First, because the gain is 
located after the light-sensitive CCD, rather than in front of it like an 
image intensifier, the high quantum efficiency (>90%) of a back-thinned 
CCD can be realized. Second, the amplification process introduces 
remarkably little extra noise. The gain register does introduce additional shot noise, 
however, as the amplification process is random. As this gain noise has the 
same statistical properties as the photon shot noise, the resulting shot noise 
in the signal is increased, as if fewer photons had been collected. The net 
effect of this is to make it appear as if half as many photons had been 
gathered, so that the camera's quantum efficiency appears to be halved. 
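
The way a ~1% gain per transfer compounds into a very large total gain can be sketched in a few lines. The sketch assumes an identical gain at every step and treats the halving of apparent quantum efficiency exactly as described in the text; the function names are illustrative.

```python
import math

def total_gain(gain_per_step, n_steps):
    """Mean total gain after n_steps transfers, each multiplying by (1 + gain_per_step)."""
    return (1.0 + gain_per_step) ** n_steps

def steps_for_gain(target_gain, gain_per_step):
    """Smallest number of transfer steps whose compounded gain reaches target_gain."""
    return math.ceil(math.log(target_gain) / math.log(1.0 + gain_per_step))

def effective_qe(qe):
    """Apparent quantum efficiency once gain noise doubles the shot-noise variance."""
    return qe / 2.0

for n_steps in (100, 300, 500, 700):
    print(f"{n_steps} steps at 1% each: gain of about {total_gain(0.01, n_steps):.0f}x")

print("steps needed for 1000x:", steps_for_gain(1000, 0.01))
print("apparent QE of a 90% back-thinned chip with gain on:", effective_qe(0.90))
```

Because the gain is exponential in the number of steps, small changes in the per-step gain (set by the register voltage) change the total gain substantially.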

Imaging quickly necessitates short exposure times (hence few photons) 
and cameras with fast readout (hence high read noise). Under these condi- 
tions, noise in the image will be dominated by the camera read noise, and 
amplification, which reduces the effective read noise, will substantially 
improve the final image quality. This is often the regime that spinning- 
disk confocal microscopy experiments fall in, as a typical experiment 
involves acquiring rapid Z-stacks in time-lapse, often necessitating expo- 
sures of 100 ms or less. For this reason, most spinning-disk confocal systems 
are now paired with EMCCD cameras. However, if you are imaging bright 
samples and do not need to image quickly, it may be worth considering 
conventional CCD cameras as well. 
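
The trade-off between the two camera types can be made concrete with a simple noise model. This is a sketch under stated assumptions: only shot noise and read noise are considered, the EM gain doubles the shot-noise variance and divides the read noise by the gain (as described above), and both cameras are compared for the same number of detected photoelectrons. The read noise of 8 e⁻ for the conventional CCD comes from the range quoted in the text; the 50 e⁻ read noise and gain of 300 for the EMCCD are illustrative assumptions.

```python
import math

def ccd_snr(photoelectrons, read_noise):
    """SNR of a conventional CCD: shot noise plus read noise, added in quadrature."""
    return photoelectrons / math.sqrt(photoelectrons + read_noise ** 2)

def emccd_snr(photoelectrons, read_noise, em_gain):
    """SNR of an EMCCD: doubled shot-noise variance, read noise divided by the gain."""
    effective_read_noise = read_noise / em_gain
    return photoelectrons / math.sqrt(2.0 * photoelectrons + effective_read_noise ** 2)

CCD_READ_NOISE = 8.0      # e-, upper end of the range quoted in the text
EMCCD_READ_NOISE = 50.0   # e-, assumed for a fast readout
EM_GAIN = 300.0           # assumed

for signal in (5, 20, 64, 200, 1000):
    print(f"{signal:5d} e-: CCD SNR {ccd_snr(signal, CCD_READ_NOISE):6.2f}, "
          f"EMCCD SNR {emccd_snr(signal, EMCCD_READ_NOISE, EM_GAIN):6.2f}")
```

In this model the EMCCD wins when the signal per pixel is below roughly the square of the conventional camera's read noise, and the conventional CCD wins above it, which is the regime of bright samples and slow imaging.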

EMCCDs are made by three major manufacturers: Roper Scientific, 
Andor Technologies, and Hamamatsu. The cameras from all three companies 
tend to be similar, as they all utilize the same EMCCD chips. The most popular 
EMCCDs (e.g., Photometrics Evolve, Andor iXon, and Hamamatsu 
ImagEM) use a 512 × 512 pixel back-thinned EMCCD chip from e2v, 
which can be read out at video rate (30 fps). There is also a 1k × 1k pixel 
back-thinned EMCCD available, but it can be read out at only 8 fps and it is 
substantially more expensive than the 512 × 512 pixel EMCCDs. There is 
also a front illuminated EMCCD available, but this should probably be 
avoided due to its substantially lower quantum efficiency compared to the 
back-thinned EMCCDs. 

To achieve optimal resolution of a microscopy system, the magnification 
of the image on the camera must be chosen to match the resolution limit of 
the objective. Specifically, for the camera to be able to accurately resolve an 
object of a given size, that object must span at least two pixels on the camera; 
this is known as Nyquist-Shannon sampling (Pawley, 2006). Therefore, for 
your camera to achieve the full resolution that your objective is capable of, 
you need to magnify the image on the camera such that one camera pixel 
covers a distance of less than half the resolution limit of the objective when 
referred to the object plane. For example, a 100×/1.4 NA objective has a 
resolution limit of ~220 nm for GFP, so one pixel should span <110 nm, 
referred to the object plane, to achieve maximum resolution. For a camera 
with 16 µm pixels, one pixel will correspond to a size of 160 nm in the 
object plane with no magnification beyond that of the 100× objective. 
Therefore, to achieve maximal resolution, additional magnification must be 
introduced; this is most easily done by replacing the projection lens of the 
confocal scanhead with a lens of longer focal length. It is often useful to 
include additional magnification beyond that required by Nyquist-Shannon 
sampling; often 2.5–3 pixels per resolution unit are helpful, and for some 
digital image analysis procedures, additional magnification beyond this 
("empty magnification") may be helpful. 
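
The sampling arithmetic in this example is easy to repeat for other 
cameras and objectives. The following is a minimal sketch; the function 
names are hypothetical, and the numbers are those used in the text. 

```python
def object_plane_pixel_nm(pixel_um, magnification):
    """Size of one camera pixel referred to the object plane, in nm."""
    return pixel_um * 1000.0 / magnification


def extra_magnification(pixel_um, objective_mag, resolution_nm, pixels_per_unit=2.0):
    """Additional magnification needed so that `pixels_per_unit` camera
    pixels span one resolution unit (2.0 is the Nyquist-Shannon minimum)."""
    target_pixel_nm = resolution_nm / pixels_per_unit
    return object_plane_pixel_nm(pixel_um, objective_mag) / target_pixel_nm


# Values from the text: 16 um pixels, 100x/1.4 NA objective, ~220 nm for GFP.
print(object_plane_pixel_nm(16, 100))  # pixel size in the object plane (nm)
for sampling in (2.0, 2.5, 3.0):
    factor = extra_magnification(16, 100, 220, sampling)
    print(f"{sampling} pixels per resolution unit: {factor:.2f}x extra magnification")
```

For this camera and objective, roughly 1.5× of additional magnification 
meets the two-pixel minimum, and about 1.8–2.2× gives 2.5–3 pixels per 
resolution unit. 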

2.6. Other hardware considerations 

For doing time-lapse measurements of yeast cells, a few additional compo- 
nents will be helpful. Most likely, you will want some kind of temperature 
control system so that your cells can be grown at a constant temperature. 
Temperature control can either be achieved by enclosing the entire micro- 
scope in a thermostatted plexiglass box, or by using a stage top heater 
combined with an objective heater. While enclosing the microscope in a 
temperature controlled box makes access to the microscope and the sample 
somewhat cumbersome, it does maintain a very stable environment around 
the entire microscope (typically stable to within ±0.1 °C). This tempera- 
ture stability helps minimize focus drift due to thermal fluctuations and 
makes switching samples and objectives simple. Stage top incubators are 
much smaller and cheaper and do not encumber the microscope at all. 
However, for use with oil-immersion objectives, they must be paired with 
an objective heater as otherwise the objective acts as a heat sink and will 
chill the sample. As use of an objective heater makes changing objectives 
difficult, and a stage top incubator will typically only hold a single size of 
sample dish, this approach adds its own set of difficulties. 

Another desirable addition for time-lapse data acquisition is a 
motorized X-Y stage. This greatly increases the number of cells that can be 
recorded in time-lapse by recording multiple fields of view in succession. 
As it typically takes only 10–15 s to move to a new field of view, 
autofocus with a hardware autofocus device, and acquire images, a motorized 
stage will allow tens of fields to be collected during the typical 5 min 
interval of a time-lapse movie. 
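
As a quick check of how many fields fit into one time point, the sketch 
below divides the time-lapse interval by the time spent per field. The 
function name is hypothetical; the timings are those quoted above. 

```python
def fields_per_interval(interval_s, seconds_per_field):
    """Number of complete fields of view that fit in one time-lapse interval."""
    return int(interval_s // seconds_per_field)


interval_s = 5 * 60  # typical 5 min interval from the text
for seconds_per_field in (10, 15):
    n = fields_per_interval(interval_s, seconds_per_field)
    print(f"{seconds_per_field} s per field: {n} fields per time point")
```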

For cases where fast Z-stack acquisition is required, a piezoelectric 
Z-stage is a necessary addition. Addition of a piezoelectric Z-stage to either 
a motorized or a manual stage allows Z-stack acquisition at up to 30 fps if 
your camera is fast enough and your exposure time is short enough. 
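
To estimate how long a Z-stack will take, the following sketch assumes 
that each slice takes whichever is longer, the exposure time or the 
minimum frame time of the camera, and that the piezo settling time is 
negligible. Both are simplifying assumptions, and the 20-slice stack is 
a hypothetical example. 

```python
def stack_time_s(n_slices, exposure_s, max_fps=30.0):
    """Approximate time to acquire one Z-stack, in seconds.

    Each slice is limited either by the exposure time or by the maximum
    camera frame rate; stage settling time is ignored (assumption).
    """
    frame_time_s = max(exposure_s, 1.0 / max_fps)
    return n_slices * frame_time_s


print(stack_time_s(20, 0.100))  # exposure-limited: 100 ms per slice
print(stack_time_s(20, 0.020))  # camera-limited: 30 fps
```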

To minimize vibration and sample drift, you will want to place the 
entire system on a vibration isolation table. 

2.7. Software 

As the spinning-disk scanhead is a relatively passive device (you can set the 
rotation speed and internal filters), it can be controlled by a large number of 
software packages. Therefore, the software package you choose to control 
your microscope will depend more on the other components in your 
system, particularly the microscope body. Most microscope manufacturers 
now have their own software packages for controlling their microscopes, 
which should also be capable of controlling the other components in their 
systems. In addition, there are a number of third-party software packages 
that can control most instrumentation. Metamorph, from Molecular 
Devices, has been a standard third-party software package for many years, 
and controls a large variety of hardware. More recently, a free, open-source, 
microscope control package called Micro-Manager has been released. This 
is an appealing choice for labs looking to save money or who wish to 
customize their software. Microscope control software will usually include 
image analysis tools as well and so is generally a good starting point for data 
analysis. 



2.8. System integration 

In most cases, it does not make sense to purchase all the items described here 
individually. You will want to work with a system integrator who will sell 
you the scanhead, lasers and optics, and additional microscope components 
you need and will assemble the system and get it working. You may need to 
purchase the microscope stand separately and then add the spinning-disk 
confocal from a separate vendor. Two system integrators we have had good 
luck with are Solamere Technologies and Andor Technology. 




3. Sample Preparation 

3.1. Fluorescent tagging and choice of fluorescent protein 

One of the major strengths of budding yeast as a cell biological model 
organism is the ease with which genes can be tagged with fluorescent 
proteins. Homologous recombination is extremely efficient in budding 
yeast allowing a fluorescent protein sequence to be integrated nearly any- 
where in the genome using a 40-bp sequence to provide targeting. This 
enables tagging of a specific gene by amplification of a template with PCR 
primers containing targeting sequences for the gene to be tagged. Providing 
methods for performing gene tagging is beyond the scope of this review, but 
protocols are widely available (Amberg et al., 2005; Gauss et al., 2005; Knop 
et al., 1999; Longtine et al., 1998; Petracek and Longtine, 2002). Because of 
the ease of fluorescent protein tagging in yeast, these are probably the most 
common fluorescent probes used, although other probes can be used as well 
(Giepmans et al., 2006). 

A wide variety of fluorescent proteins (FPs) have been discovered and 
engineered recently. While a comprehensive discussion of available fluores- 
cent proteins and their properties is not possible given space constraints, 
several excellent reviews are available (Shaner et al., 2005, 2007; Straight, 
2007; Zacharias and Tsien, 2006). Here, I focus on FPs readily available in 
Saccharomyces cerevisiae tagging vectors and newer proteins with desirable 
properties (Table 23.1). Tagging vectors are readily available for CFP, GFP, 
and YFP, and vectors are also available for mCherry, mTFP1, and Sapphire; 
unpublished vectors also exist for other fluorescent proteins. 

When choosing a fluorescent protein for spinning-disk confocal imaging 
of yeast, there are several factors to take into account. Most important is the 
detectability of the protein — how bright will a tagged protein be relative to 
the background fluorescence of the cell? Additional factors are the availability 
of a laser line well matched to the excitation spectrum of the fluorescent 
protein, the photostability of the fluorescent protein, and whether or not the 
fluorescent protein perturbs the function of the protein to which it is fused. 

Detectability of a fluorescent protein is a function of both the brightness of 
the tagged protein and the autofluorescence of the cell. The intrinsic 
brightness of the FP plays a major role: if all other factors are held 
constant, the brighter the protein, the more detectable it will be. Intrinsic 
brightness is easily calculated as the product of the FP's extinction 
coefficient, which measures how efficiently the fluorophore absorbs light, and 
its quantum yield, which measures what fraction of absorbed photons are 
reemitted as fluorescence; values are given in Table 23.1. Additionally, the 
maturation rate of the FP will directly affect the brightness of the tagged 
protein, as a protein with a slow maturation rate will be degraded and diluted 
by cell growth faster than it matures, leading to a majority of the protein 
being nonfluorescent. While maturation rates are not known for all FPs, and 
are frequently measured in vitro or in E. coli, conditions that may not be 
relevant to yeast expression, the proteins listed in Table 23.1 generally have 
reasonably fast maturation rates (30 min or less). Similarly, translation 
efficiency will influence the brightness of a fusion protein; I have found 
that codon optimization of the FP can give a twofold increase in brightness 
(Sheff and Thorn, 2004), although mRNA secondary structure may play a larger 
role (Kudla et al., 2009). 
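
The brightness calculation can be reproduced in a few lines. This is a 
sketch only: the extinction coefficients and quantum yields are copied 
from Table 23.1 for four of the proteins, the function and variable names 
are hypothetical, and the division by 1000 follows the table footnote. 

```python
# (extinction coefficient in M^-1 cm^-1, quantum yield), from Table 23.1
FLUORESCENT_PROTEINS = {
    "EGFP": (56_000, 0.60),
    "Venus": (92_200, 0.57),
    "mCherry": (72_000, 0.22),
    "mPlum": (41_000, 0.10),
}


def brightness(extinction, quantum_yield):
    """Intrinsic brightness: extinction coefficient x quantum yield / 1000."""
    return extinction * quantum_yield / 1000.0


# Rank the proteins from brightest to dimmest.
ranked = sorted(
    FLUORESCENT_PROTEINS.items(),
    key=lambda item: brightness(*item[1]),
    reverse=True,
)
for name, (extinction, quantum_yield) in ranked:
    print(f"{name:8s} {brightness(extinction, quantum_yield):5.1f}")
```

Note that this ranks intrinsic brightness only; it does not account for 
maturation, expression level, or cellular autofluorescence, all of which 
affect detectability in practice. 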

The other component of detectability is cellular autofluorescence; the 
brighter the cellular autofluorescence, the brighter a tagged protein will 
have to be in order to be detected above this background. Careful choice of 
yeast strain and growth conditions can reduce autofluorescence (see below), 
but it cannot be completely eliminated. Much of the autofluorescence in yeast 
is due to flavins, which absorb broadly in the violet-blue range (400–500 nm) 
and emit in the green (~530 nm). Yeast autofluorescence is therefore 



Table 23.1 Fluorescent proteins 

Protein     References                           λex (nm)  λem (nm)  ε (M⁻¹ cm⁻¹)  QY    Brightness (a)  Tagging vectors
TagBFP      Subach et al. (2008)                 402       457       52,000        0.63  32.8
Sapphire    Cubitt et al. (1999)                 399       511       29,000        0.64  18.6            Sheff and Thorn (2004)
T-Sapphire  Zapata-Hommer and Griesbeck (2003)   399       511       44,000        0.6   26.4
ECFP        Tsien (1998)                         433       475       32,500        0.4   13.0            Sheff and Thorn (2004), Janke et al. (2004), Hailey et al. (2002)
mTFP1       Ai et al. (2006)                     462       492       64,000        0.85  54.0            Deng et al. (2009)
EGFP        Tsien (1998)                         488       507       56,000        0.6   33.6            Sheff and Thorn (2004), Janke et al. (2004), Deng et al. (2009), Longtine et al. (1998), Gauss et al. (2005), Wach et al. (1997)
Citrine     Griesbeck et al. (2001)              516       529       83,400        0.76  58.5            Sheff and Thorn (2004), Deng et al. (2009)
Venus       Nagai et al. (2002)                  515       529       92,200        0.57  52.5            Sheff and Thorn (2004)
mKOκ        Tsutsui et al. (2008)                551       563       105,000       0.61  64.0
mCherry     Shaner et al. (2004)                 587       610       72,000        0.22  15.8            Deng et al. (2009)
TagRFP      Merzlyak et al. (2007)               555       584       100,000       0.48  49.0
mPlum       Wang et al. (2004)                   590       649       41,000        0.1   4.1
mKate2      Shcherbo et al. (2009)               588       633       62,500        0.4   25.0

An expanded table is available online at http://thornlab.org/gfps.htm. 
(a) Product of ε and QY, divided by 1000. 

strongly wavelength dependent, being much less intense for longer wavelength 
imaging than for shorter wavelength imaging. Because of this, orange 
and red fluorescent proteins are more detectable than would be expected 
given their intrinsic brightness. 

The photostability of the fluorescent protein is important when doing 
time-lapse imaging; more photostable proteins will bleach more slowly and 
so will allow more frames to be acquired before the signal drops to an 
unacceptable level. For acquiring single images, photobleaching rate is 
relatively unimportant. Photobleaching rates have been measured for many 
of the proteins suggested here and are available in the original publications 
and in several reviews (Shaner et al., 2005, 2007; Zacharias and Tsien, 2006). 

Finally, of critical importance is whether or not fusion of a fluorescent 
protein to your protein of interest will perturb its function. This will of course 
depend on the protein being tagged and on where it is tagged, but it seems that 
GFP fusions to the C-terminus of proteins in yeast are generally well tolerated, 
as 87% of essential yeast genes could be successfully tagged with GFP in a 
systematic tagging effort (Howson et al., 2005; Huh et al., 2003). In my 
experience, other GFP variants (e.g., CFP and YFP) are equally well tolerated, 
and mCherry is well tolerated also. Many of the newer fluorescent proteins 
mentioned have not been extensively studied in the context of fusion proteins, 
and so little information is available about how they are tolerated. 

For general purpose imaging, tagging with GFP is a good place to start. It 
is readily available in a monomeric form, reasonably bright, and generally 
well behaved. For imaging a second color, mCherry has minimal cross talk 
with GFP and is also generally well behaved. For multicolor imaging, CFP/ 
YFP/mCherry is a good combination, and Sapphire can be added with the 
introduction of a small amount of cross talk (Sheff and Thorn, 2004). 
TagBFP is another possibility for a fourth color, and it may be possible to 
multiplex mKOκ for a fifth color with some small amount of cross talk. For 
imaging low-abundance proteins, it may be worth exploring less common 
options to maximize signal-to-noise ratio. In general, moving to longer 
wavelengths is advantageous as yeast autofluorescence is less intense at 
longer wavelengths. For proteins that are rapidly turned over, using a 
fast-folding protein such as Venus may help (Yu et al., 2006). For difficult 
imaging cases, I would expect that it may be necessary to try several proteins 
before identifying an optimal tag. Finally, imaging in diploid cells may be 
preferred to imaging in haploid cells as diploid cells are larger. 

3.2. Minimizing autofluorescence 

Yeast cells are somewhat autofluorescent, and the commonly used ade2⁻ 
strains accumulate a highly fluorescent red pigment (Ishiguro, 1989; Stotz 
and Linder, 1990). If possible, it is best to avoid ade2⁻ strains, although 
accumulation of this pigment can be avoided by supplementing the growth 
medium with 20 µg/ml adenine. This supplementation can also be beneficial 
for strains that are ADE+. Cells tend to be less fluorescent in early log 
phase (OD600 ~0.2). Yeast media can also be autofluorescent. Rich media 
such as YPD are highly autofluorescent and should be avoided when 
imaging. Cells grown in rich media can be washed into a less fluorescent 
medium or buffer prior to imaging; however, these fluorescent media seem to 
increase the autofluorescence of cells slightly even after washing. In 
synthetic (minimal) media, the major sources of autofluorescence are 
riboflavin and folic acid, and omitting these components eliminates the media 
autofluorescence if the medium is made from scratch. Most S. cerevisiae 
strains can synthesize both riboflavin and folic acid, so eliminating these 
vitamins from the medium does not appear to have drawbacks. Commercial yeast 
nitrogen base is often fluorescent even when it is lacking these two vitamins, 
so I recommend testing its fluorescence before use or making your own 
from scratch (Sheff and Thorn, 2004). Commercial CSM supplements do 
not seem to be autofluorescent and so can be safely used. Growing cells in 
this low-fluorescence medium allows direct imaging of cultures without 
washing, and substantially reduces backgrounds for cells grown on agarose 
pads or in a perfusion system. 



3.3. Mounting 

The simplest way to prepare yeast for imaging for short periods of time 
(30–60 min) is by immobilizing them on Concanavalin A coated coverslips. 
Concanavalin A binds to sugars in the yeast cell wall and will stick the yeast 
tightly to the coverslip so that they will not move during imaging. 
Concanavalin A coated coverslips can easily be prepared as follows: 

1. Prepare Concanavalin A solution by dissolving Concanavalin A (Sigma 
#L7647) in distilled water to 0.5 mg/ml. Refrigerate. 

2. Rack 22 mm #1.5 coverslips (Racks: Electron Microscopy Sciences 
#72241-01). 

3. Soak coverslips overnight in 1 M NaOH with gentle shaking. Sterile 
filter NaOH before use to remove dust. 

4. Pour off NaOH and save for reuse. Wash coverslips 3x with distilled 
water. 

5. Add Concanavalin A solution and soak for 20 min with gentle shaking. 

6. Pour off Concanavalin A solution and save for reuse. 

7. [Optional] Rinse coverslips once with distilled water. 

8. [Optional] Spin coverslips dry on microplate carriers in Sorvall RT7. 
Place racks on a piece of paper towel to catch liquid and spin 1 min at 
700 rpm. 

9. Place racks in hood until absolutely dry. Store in coverslip box at RT. 
Coverslips should be good for at least 1 month. 

To use the coverslips, simply place a 6–8 µl drop of yeast cell suspension 
on a slide, and drop a coated coverslip on it. The cells should stick to 
the coverslip within a few minutes. The coverslip can be sealed with 
petroleum jelly or VALAP (Vaseline: Lanolin: Paraffin, 1:1:1) for longer 
term imaging, or left unsealed for short-term studies (~10 min). If the 
cells are grown in low-fluorescence media, they can be directly placed on 
coverslips for imaging; if they are grown in more fluorescent media, 
particularly in YPD or other rich media, you will substantially reduce 
the background fluorescence by first washing them into PBS or some 
other nonfluorescent buffer or medium. Pelleting and resuspending cells 
can also be used to concentrate the cells if the cell density in the initial 
culture is not high enough to get a sufficient number of cells in the field of 
view. 

For longer term imaging (up to 4–6 h), cells can be grown on agarose 
pads containing low-fluorescence yeast medium. These can be prepared by 
dissolving 1.2% agarose in low-fluorescence SC + carbon source. The 
agarose solution is then cast in a 1-mm slab in a device used for casting 
polyacrylamide gels. If kept in a sealed container with moist paper towels, 
such a gel can be kept for roughly a week. Pieces of agarose 
(~15 × 15 mm) are then cut out with a sterilized razor blade and placed 
on a slide, a drop of yeast suspension is placed on the agarose block, and a 
coverslip is placed on top. The gap between the coverslip and slide can 
then be filled with petroleum jelly to prevent evaporation. This is easily 
done by filling a syringe with petroleum jelly and injecting it through a 
large gauge needle into the space between the coverslip and slide. Alter- 
natively, these pads can be sealed with VALAP. Agarose pads can also be 
made by filling depression slides with agarose, or sandwiching a drop of 
agarose between two slides, but I find the casting method described here 
to be the easiest. 

For even longer term imaging (overnight or longer), it is probably best 
to use a microfluidic device to provide constant nutrient flow to the cells 
and remove waste products. We have had good luck with the CellASIC 
ONIX system (www.cellasic.com). This system consists of a control 
device paired with a special microfluidic plate which has the same footprint 
as an ordinary 96-well plate. The plate contains viewing areas where cells 
can be trapped and immobilized for long-term imaging while being continuously 
perfused with solution. The cells are trapped and held in place by 
being loaded under pressure (6–8 psi) into a viewing area which traps the 
cells under an elastomeric membrane 4.2 µm above the coverslip. The 
downward pressure applied by the ceiling membrane holds the cells in 
place indefinitely. Media can be perfused from one of two wells, allowing 
rapid media switching. Cells have been kept in this device for up to 3 days 
and continued to divide (Lee et al., 2008). 

Abstract 

Correlative light and electron microscopy represents the ultimate goal for the 
visualization of cell biological processes. In theory, it is possible to combine the 
strengths of both methods, that is, the live-cell imaging of the movement of 
GFP-tagged proteins captured by fluorescence microscopy with an image of the 
fine structural context surrounding the tagged protein imaged and localized by 
immunoelectron microscopy. In practice, inherent technical limitations of the 
two individual methods and their combination make the technique very 
complex to handle. Here, we present a high-pressure freezing and 
freeze-substitution protocol which fulfills the key criterion for correlative 
microscopy, 
namely, the ability to achieve excellent visibility of fine structures without 
disrupting the antigens recognized by the immunolabeling protocol. This is 
achieved by a fixative-free freeze-substitution and low-temperature embedding. 




1. Introduction 

With the ease of genetic manipulation and the availability of online 
databases on the localization patterns of GFP-fusion proteins in yeast (Huh 
et al., 2003), our knowledge of the localization and function of yeast proteins 
at the light microscope (LM) level has increased dramatically. In addition, 
structural and immunolabeling studies by electron microscopy (EM) are adding 
data on the cellular architecture and the distribution of proteins on the 
nanometer scale. Still, the electron microscopic characterization of yeast 
cells is 
lagging behind when compared to tissue culture cells. This is mainly due to 
the difficulties encountered when preparing yeast for transmission electron 
microscopy (TEM), which are mainly characterized by infiltration problems 
and the inherent density of the yeast cytoplasm. Furthermore, there is a lack 
of preparation protocols yielding both good visibility of the organelles of 
interest together with immunolabeling in the same sample. The goal of the 
research presented here is to establish a protocol that ultimately should allow 
the direct correlation of live-cell fluorescent imaging of GFP-fusion proteins 
with structural- and immuno-EM data from the same cell. 

Unfortunately, key publications in the yeast field still extensively rely on 
chemical fixation protocols in spite of the multitude of known artifacts 
induced by these methods. This is especially astonishing since already in the 
late 1980s, Baba and Osumi noted the superior structure of yeast that were 
plunge frozen and freeze-substituted (Baba and Osumi, 1987; Baba et al., 
1989). The images published in those articles still hold up to today's 
standards 20 years later. Today, high-pressure freezing (HPF) and freeze- 
substitution (FS) are on their way to becoming routine techniques in many 
EM facilities, since all the necessary equipment is commercially available 
and the know-how to successfully prepare biological samples has increased 
dramatically. 

The scope of this chapter is to update the information presented in the 
last edition (McDonald and Mueller-Reichert, 2002) and to introduce a 
new protocol to prepare yeast TEM samples with excellent structural 
visibility which are at the same time suitable for immunolabeling. The 
following paragraphs will first cover the rationale and pitfalls of the 
HPF/FS approach and the immunolabeling, with the step-by-step 
protocols at the end. 

2. Recent Advances in High-Pressure Freezing and Freeze-Substitution 

The list of commercially available equipment for both HPF and FS has 
grown considerably in recent years, although it can be expensive to pur- 
chase and maintain. If state of the art equipment is not available, keep in 
mind that the first trials in using cryo-methods can also be done with far 
cheaper homemade equipment as mentioned in the previous edition 
(McDonald and Mueller-Reichert, 2002). 

2.1. High-pressure freezing 

Both the Bal-Tec HPM 010 and the Leica EMPact have been updated. 
Leica Microsystems now sells the EMPact2, which follows the same design 
as the EMPact (i.e., a mobile HPF machine with separated pressurizing and 
cooling lines), but now adds the option of the so-called RTS (rapid transfer 
system) designed for correlative LM/EM projects. The RTS allows the 
observation of the sample in the LM and high-pressure freezing within 
approximately 4 s (Verkade, 2008). 

A similar move toward instrument mobility and correlative microscopy has 
been made by Bal-Tec with the new HPM 100. Leica Microsystems has recently 
acquired Bal-Tec and is now selling this system as well under the name Leica 
HPM 100. The machine formerly known as the Bal-Tec HPM 010 is now 
being sold by Boeckler Instruments, Inc. (Tucson, AZ) as the HPM 010. 

A new high-pressure freezer, the HPF Compact 01, has been introduced 
recently by Wohlwend Engineering (distributed by Technotrade Interna- 
tional in the USA). It operates under the same principle as the Bal-Tec 
HPM 010, but has an improved liquid nitrogen streaming and pressurizing 
system. As a result, it has significantly lower liquid nitrogen consumption 
and increased reproducibility in freezing. 

Both the Leica EMPact2 and the Wohlwend HPF Compact 01 
machines have a range of loading solutions for different types of specimens 
and applications. It has to be noted that the maximal sample diameter for the 
Leica EMPact is only 1.5 mm, whereas the Wohlwend HPF Compact 01 
and the Bal-Tec HPM 010 use carriers with a cavity of 2 mm in diameter. 
Ultimately, the question of what HPF to buy is a matter of preference and 
budget, since all produce well-frozen samples. 

2.2. Freeze-substitution 

The basic principle underlying FS systems is simple: A metal stage which is 
constantly cooled from below by liquid nitrogen contains a heating system 
to regulate the temperature. The temperature is monitored and the heating 
is triggered if it drops below the desired value. Various inexpensive homemade 
devices are used in EM facilities. While these systems are perfectly 
sufficient for most standard FS and embedding procedures, handling some 
toxic resins at low temperature is tedious and usually exposes the user to 
fumes. The only commercially available FS machines are the AFS series by 
Leica Microsystems and the FS 7500 by Boeckler Instruments, Inc. Leica 
has recently added the automated AFS, in which a pipetting robot can be 
programmed to exchange the solutions, thus greatly reducing the exposure 
of the user to fumes. 

An important recent shift in thinking has also occurred in the composition 
of FS media: the addition of water. Previously, water was thought to 
prevent complete FS, based on observations by Humbel and Müller (1985). 
In their elegant quantitative assay, the addition of small amounts of water 
reduced the substitution capacity of various solvents. For a long time, this 
added water was believed to delay the substitution process beyond the 
recrystallization temperature of the cellular water, which would lead to ice 
crystal damage. In contrast to this hypothesis stands the observation that 
adding 5% water to the FS media dramatically increases the visibility of the 
cellular fine structure (Walther and Ziegler, 2002). Intriguingly, the water 
only needs to be present at a temperature around −60 °C during the FS to 
exert this effect (Buser and Walther, 2008). In spite of not knowing the 
exact mechanism, we now routinely and successfully add 5% water to our 
FS media for a wide range of specimens. The presence of this amount of 
water also enables us to dissolve any metal stain in acetone, which in the 
future might be useful for selective staining of cellular structures. 




3. How to Prepare Yeast by HPF/FS 

In order to be imaged in the TEM, yeast cells require extensive 
processing which takes several days to complete. The only point at which 
a quality control can be made is the final observation in the TEM. This 
makes troubleshooting very difficult and time-consuming, and minor errors 
can ruin an entire batch of samples. In summary, the yeast cells have to be 
concentrated, loaded into the HPF hat, high-pressure frozen, freeze- 
substituted, embedded in resin, and sectioned. Most of these aspects have 
been covered in the previous edition of this book (McDonald and Mueller- 
Reichert, 2002). Here, we add a more demanding preparation protocol 
yielding samples with excellent structural preservation (comparable to 
heavily fixed samples) and concurrently good preservation of epitopes for 
immunolabeling. 

3.1. Growing and concentrating yeast 

The medium used for growing the yeast is exceedingly important and 
determines the success of this preparation. Any components that are poorly 
soluble in acetone during FS should be avoided or used at low concentra- 
tions for reasons stated below. The most important of these components is 
the carbon source, that is, glucose in most cases. Accordingly, we used 
standard YPD medium containing only 1% glucose. 

A concentration step is necessary to obtain a sample that has a sufficient 
number of yeast cells per field of view in the TEM. Additionally, less water in 
the sample means better freezing quality. Ideally, the cells should be manipu- 
lated (and thus stressed) as little as possible to obtain a morphology that is as 
close to the native state as possible. A simple way is to filter approximately 
50 ml of yeast culture as described in McDonald and Mueller-Reichert 
(2002). Here, we propose an alternative filtration method using small syringe 
filters. This approach only requires 2 ml of yeast culture in log phase which is 
concentrated to 50 µl with a nylon membrane syringe filter. The yeast are 
resuspended from the filter and transferred to a 0.5-ml tube (to avoid drying 
and osmotic stress) and aspirated into dialysis capillaries as described below. 
Nylon filters are superior to regenerated cellulose because they clog less and 
the yeast are easily resuspended from the membrane. 

Concentration by brief spinning down in a centrifuge — for example, 
10 s in a tabletop centrifuge — should be avoided. Centrifugation has been 
reported to activate stress responses and depolarization of actin patches 
(Petersen and Hagan, 2005; Soto et al., 2007). We found that after a brief 
centrifugation of 10 s the morphology of yeast cells is intact, but only very 
few endocytic sites can be found compared to filtered samples. Since actin 
patches are sites of endocytosis in yeast, this effect might be linked to the 
reported actin depolarization. 

Irrespective of the method of concentration, it is essential to prevent 
drying of the cells: filters can be transferred to agar dishes while tubes and 
syringes with suspensions should be kept closed. 



3.2. High-pressure freezing 

It is possible to transfer a very concentrated yeast paste from a filter directly 
into the freezing hats using a toothpick (the "yeast cake" method). This 
technique requires practice and some amount of drying (i.e., osmotic stress) 
of the top cell layers will always occur. 

Alternatively, less concentrated yeast suspensions can be aspirated into a 
dialysis capillary and immersed in 1-hexadecene (Studer et al., 1995) (see 
HPF protocol below). The capillary can then be cut into short pieces that 
are loaded into the freezing hats using a 200-µm cavity (Hohenberg et al., 
1994). The advantage of this approach is that 1-hexadecene has good 
freezing properties and is not miscible with water. This causes the cell 
suspension to remain in the capillary during cutting and protects it from 
drying. Additionally, filling capillaries requires a less concentrated yeast 
suspension and a smaller culture volume (2—5 ml) than the yeast cake 
method (30-50 ml). 

3.3. Freeze-substitution 

Once frozen, it is vital to prevent the samples from premature warming to 
avert recrystallization. Yeast cells are reportedly easy to vitrify and crystal 
damage seen in the final sections in the TEM originates frequently from 
improper handling of the frozen samples or a problem in FS rather than 
in HPF. Accordingly, the samples should always be manipulated with 
precooled tweezers. It is also important to know the temperature variations 
of your FS setup. The temperature measured by the FS device is not 
necessarily identical with the temperature of the FS medium. Especially 
when using a system in which the sample vials are immersed in ethanol, the 
temperature gradient along the sample tubes can be up to 15 °C from top to 
bottom (this is less of a problem if you are using a cooled metal block to hold 
your samples). Also keep in mind that when removing the tubes to transfer 
the samples the FS medium warms at a rate of 0.5-1 °C/s. These temperature 
variations are avoided easily by immersing the vial with the FS medium 
in liquid nitrogen until it freezes completely (i.e., < -95 °C for acetone or 
< -117 °C for ethanol), dropping the HPF hat containing the sample on 
the surface and transferring the vial back to the FS device to warm it up to 
the starting temperature. Make sure the freezing hat really is immersed in 
the FS solvent once the solvent melts and the hat drops to the bottom of the tube. 

A key point is the composition of the FS medium. Acetone is the most 
suitable solvent to retain the fine structure. Ethanol and methanol can be 
used as well, but appear to cause more extraction of the sample structure 
which is partially balanced by the addition of water (Buser and Walther, 
2008). A major pitfall in FS is that certain components of yeast growth 
media and also fillers used for HPF are dissolved poorly by some solvents 
during FS, especially when infiltration is done at sub-zero temperatures. 
Two combinations are important when dealing with yeast: First and most 
importantly, sugars are very poorly soluble in acetone. High concentrations 
of glucose (and probably glycerol) encase the yeast cells during FS and cause 
incomplete FS and poor infiltration of the cell periphery (Fig. 24.1). In 
extreme cases, this results in the cells being ripped out during sectioning. 
The second problem is that 1-hexadecene is poorly soluble in ethanol. 
Using 1-hexadecene as filler to load capillaries for HPF combined with 
substitution in ethanol or methanol leads to incomplete FS and 
recrystallization damage. If substitution in ethanol or methanol is essential, 
the excess 1-hexadecene around the capillaries can either be scratched off 
carefully after HPF or a different filler should be used. 

Figure 24.1 Infiltration problems frequently observed in yeast grown in media 
containing >2% glucose. (A) Overview of a yeast culture grown in YPD and 2% glucose. 
Many cells are ripped out of section during the sectioning process (arrowhead). Those 
that remain in section show a poorly infiltrated cell periphery with obvious gaps 
(arrows). (B) Yeast cell grown in YPD and 2% glucose with characteristics of poor 
infiltration at the cell periphery and commonly the vacuole (arrows). The gaps 
observed here tend to widen during immunolabeling causing further distortions of 
the ultrastructure. (C) Yeast cell grown in YPD and only 1% glucose. Note the 
perfectly infiltrated periphery and vacuole (V). The cell wall (CW) and plasma 
membrane (PM) are well retained, and mitochondria (M), endoplasmic reticulum (ER), and 
endosomes (E) are clearly visible. 

Fixatives can be added to the FS medium to better preserve the fine 
structure. The most commonly used are osmium tetroxide or glutaralde- 
hyde, but the former is highly oxidative while the latter may cross-link and 
mask epitopes. Accordingly, both fixatives are known to interfere with 
immunolabeling. When labeling GFP-fusion proteins as described below, 
labeling was visible at a glutaraldehyde concentration of 0.1%, but was lost at 
0.4% (not shown). Since the aim of the preparation protocol presented here 
is to prepare samples optimal for immunolabeling, fixatives were omitted 
entirely in the FS mixture. To increase contrast, uranyl acetate was added to 
a final concentration of 0.1% (w/v) together with 5% (v/v) water. The FS 
temperature protocol is less critical in comparison. We suggest an initial 
incubation step at -90 °C for a few hours, followed by a slow warming of 
3 °C/h to -60 °C and a quicker warming of 8 °C/h to -18 °C for 
low-temperature infiltration. The duration of the initial incubation at -90 °C 
does not appear to influence the FS process and can be adjusted for 
convenience from 2 to >24 h (e.g., for FS over the weekend). Based on 
experiments with tissue culture cells the critical temperature range for 
adequate FS appears to be between -90 and -60 °C (Buser and 
Walther, 2008) and the warming rate should be kept low in this range. 
Since warmer temperatures correlate with stronger extraction, the remain- 
ing steps of FS and infiltration are performed as quickly as possible (the LR 
and Lowicryl resins are potent solvents). 
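
The temperature program suggested above can be laid out as a simple timeline to check how long a run takes before low-temperature infiltration can start. The following Python sketch is only an illustration: it assumes the 4 h initial incubation and the 2 h hold at -60 °C listed in the freeze-substitution protocol (Section 6.2), and the names `Step`, `program_timeline`, and `FS_PROGRAM` are hypothetical helpers, not part of any FS device software.

```python
from dataclasses import dataclass


@dataclass
class Step:
    kind: str      # "hold" or "ramp"
    value: float   # hours for a hold, warming rate in °C/h for a ramp
    target: float  # temperature in °C at the end of the step


def program_timeline(start_temp, steps):
    """Return a list of (elapsed hours, temperature in °C) after each step."""
    elapsed, temp, timeline = 0.0, start_temp, []
    for step in steps:
        if step.kind == "hold":
            elapsed += step.value
        elif step.kind == "ramp":
            elapsed += abs(step.target - temp) / step.value
            temp = step.target
        else:
            raise ValueError(f"unknown step kind: {step.kind}")
        timeline.append((elapsed, temp))
    return timeline


# Assumed values: warming rates from the text, hold times from Section 6.2.
FS_PROGRAM = [
    Step("hold", 4.0, -90.0),  # initial incubation (adjustable, 2 to >24 h)
    Step("ramp", 3.0, -60.0),  # slow warming through the critical range
    Step("hold", 2.0, -60.0),
    Step("ramp", 8.0, -18.0),  # quicker warming to the infiltration temperature
]

timeline = program_timeline(-90.0, FS_PROGRAM)
for hours, temp in timeline:
    print(f"{hours:6.2f} h  {temp:6.1f} °C")
```

With these assumed hold times the samples reach -18 °C after roughly 21 h; a longer initial incubation simply shifts the whole timeline by the same amount.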

3.4. Embedding 

Based on previous observations, fixative-free FS with water requires low- 
temperature embedding to prevent disruption of the fine structure which 
appears to occur rapidly at temperatures above °C (Buser and Walther, 
2008). 

In many articles the yeast cell wall is considered a major infiltration 
barrier and accordingly the suggested infiltration times are fairly long 
considering the small size of yeast cells. Here, we propose that the actual 
infiltration barrier is formed by the aggregation of glucose around the cells 
during FS. In agreement with this hypothesis, a reduction in the glucose 
content of the medium to 1% and three washes in ethanol (which dissolves 
sugars better than acetone) before infiltration make it possible to infiltrate 
yeast cells with LR-Gold in as little as 5 h even at -18 °C. 

Several resins suitable for low-temperature embedding and polymerization 
are available; the Lowicryl and LR resins are commonly used. The 
different Lowicryl resins offer a wide range of properties, but are very 
volatile and toxic, which makes their use in nonautomated FS systems 
difficult. The LR resins LR-Gold and LR-White are considered less toxic 
and are thus a good choice for low-temperature embedding with manual 
solution exchanges. LR-Gold is specially designed for low-temperature UV 
polymerization down to -20 °C, while LR-White is commonly polymerized 
at higher temperatures, usually around room temperature. Still, LR-White 
will also polymerize under UV at temperatures down to at least 
-10 °C if a UV catalyst is added. In our hands, the main difference is the 
better visibility of cellular structures in LR-Gold, which is an important 
aspect for the application proposed here. Samples embedded in LR-Gold 
generally show a well-preserved ultrastructure, including filaments which 
probably represent F-actin (Fig. 24.2). 



3.5. Sectioning 

Both LR-resins are more brittle and difficult to section than standard Epon 
blocks. A common problem is the wetting of the block face during section- 
ing, which can be reduced by lowering the water level in the knife trough. 
In addition, the resins often do not bond strongly with the dialysis capil- 
laries, which can sometimes cause the sections to split on the water surface. 
Still, the ease in handling of the capillaries during HPF outweighs this 
disadvantage. 







Figure 24.2 High-magnification view of the tip of a small-budded yeast cell showing 
the high structural quality of the samples obtained with this preparation protocol. Two 
ER cisterns (ER) are extending toward the bud tip and a secretory vesicle is visible 
(arrowhead). Further magnification of the boxed region reveals weakly visible filaments 
(arrows), probably actin. 

4. Immunolabeling 



The aim of our protocol is to label GFP-fusion proteins by immuno- 
EM and to correlate the ultrastructural data with dynamic data acquired by 
live-fluorescence microscopy (Fig. 24.3). The ideal sample for the localiza- 
tion of proteins by immuno-EM possesses both an excellent visibility of 
cellular structures and a good retention of epitopes to be bound by the 
antibody. The dilemma is that good structural preservation usually requires 
strong fixation, while the retention of epitopes asks for as little fixation as 
possible. The aim of any protocol thus has to be to find the optimal middle 
way. The use of low-temperature embedding enables us to completely 
avoid fixatives while still retaining the fine structure. Low-temperature 
embedding is especially crucial for this approach, since the transition to 
room temperature causes strong extraction in weakly fixed samples. We 
chose LR-Gold as resin because it can be polymerized at -18 °C and is less 
toxic than the comparable Lowicryl resins. Furthermore, the visibility of 
fine structures is better in LR-Gold than in the similar LR-White. The 
samples were freeze-substituted in acetone containing 0.1% uranyl acetate 
and 5% water, low-temperature embedded in LR-Gold at -18 °C and 
labeled by a two-step immuno-EM protocol. While the yeast fine structure 
was well retained in unlabeled sections, the longer incubation in the 
blocking and labeling solutions caused some deterioration of the sections, 
especially in poorly infiltrated cells. This was probably due to a combination 
of swelling and extraction of the unfixed cellular material. Still, the 
structural quality of the samples is sufficient to detect Abp1-GFP-labeling 
(Kaksonen et al., 2003) of fine structures like primary endosomes with 
negligible background (Fig. 24.3B). The difference in the structural detail 
visible in Figs. 24.2 and 24.3B is a consequence of the above-mentioned 
extraction during labeling of the sections, since extended on-section staining 
with uranyl acetate and lead citrate does not improve the visibility of the 
vesicles. This could possibly be improved by either reducing the incubation 
times with the antibody, a stronger cross-linking of the resin, or the addition 
of low amounts of fixatives to the FS mixture (e.g., <0.1% glutaraldehyde 
or osmium tetroxide). Unfortunately, all those changes can decrease the 
labeling efficiency and need to be optimized individually for every 
epitope-antibody combination. By generally omitting fixatives the present protocol 
is a good starting point for this endeavor. Furthermore and in contrast to 
mammalian tissue culture cells, the high density of the yeast cytoplasm 
often obscures fine structures and poses a major challenge in the study of 
endocytic trafficking in yeast. 

Figure 24.3 Correlation of live-fluorescence images of a yeast strain expressing 
Abp1-GFP with immuno-EM data after anti-GFP labeling. (A) Live-fluorescence image of 
Abp1-GFP expressing yeast cells. The actin-binding protein Abp1 is involved in the 
final steps of endocytosis and remains on the primary endosomes for several seconds 
after internalization (Kaksonen et al., 2003). (B) Anti-GFP immuno-EM on sections of 
Abp1-GFP expressing yeast cells. The label is visible in small clusters within 
approximately 1 µm of the plasma membrane. The extra- and intracellular background was 
negligible. The ultrastructure is slightly degraded during the labeling process, probably 
due to a combination of section swelling and extraction of the unfixed cellular material 
(compare with Figs. 24.1C and 24.2). (C) Magnification of the boxed area reveals that 
the gold clusters around small spherical structures (arrows), which probably represent 
tangentially sectioned primary endosomes that are still surrounded by a cloud of Abp1. 




5. Conclusions 

The preparation scheme presented here aims to preserve yeast cells 
in a close-to-native state and to allow on-section immunolabeling with 
good visibility of their fine structure. The immunolabeling of GFP is of 
particular interest due to the vast number of GFP-tagged proteins being 
studied by fluorescence microscopy and because it allows standardization of 
the immunolabeling protocol for different GFP-tagged proteins. Impor- 
tantly, the fixative-free approach presented here combines excellent fine 
structural preservation and retention of epitopes for immunolabeling in one 
sample, allowing a direct correlation of protein distributions with the 
cellular ultrastructure. The most damage to the ultrastructure is apparently 
done during the immunolabeling process when the sections are immersed in 
aqueous buffers for approximately 1 h. It remains to be seen if shorter 
labeling times can improve the structure without reducing the signal. 
Alternatively, the damage can also be avoided by serial sectioning and 
using complementary sections for ultrastructural and labeling purposes, 
respectively. Nevertheless, it cannot be overemphasized that the key aspects 
for successfully preparing yeast lie in the growth and the handling of the cells 
before HPF, that is, low sugar content in the growth medium and preven- 
tion of osmotic and centrifugal stress. 

The ultimate goal in correlative microscopy of biological systems is 
to directly correlate the movement of GFP-tagged proteins in living cells, to 
arrest them in a defined state and process the same cell for immuno-EM to 
reveal the distribution of the GFP-tagged component within the fine struc- 
tural context. The protocol presented here fulfills the last of the above 
mentioned conditions, namely the ability to immunolocalize a protein while 
still preserving a clearly visible cellular ultrastructure. Still, to realize true 
correlative microscopy (i.e., observation of the same cell by both LM and 
EM), several technical aspects of our preparative protocol need to be adapted. 
Most importantly, a sample holder system has to be developed to meet 
restrictions imposed by both the fluorescence microscopy (e.g., transparency 
to light) and the freezing method (e.g., good cooling rates). While the Leica 
EMPact2-RTS meets these criteria, the time resolution is too low (4-5 s) for 
some of the endocytic events that interest us. 

In summary, we show that correlative immuno-EM is technically 
achievable without sacrificing ultrastructural resolution or epitopes. The 
ability to label GFP in such a way also allows us to build a GFP-immuno-EM 
localization database similar to the database available for fluorescence 
microscopy (Huh et al., 2003). 

6. Protocols 

The following protocols describe the process and materials needed for 
high-pressure freezing with a Wohlwend HPF Compact 01 and freeze- 
substitution in a Leica AFS. 



6.1. Filtration and HPF 

1. Special materials needed: 

- syringe filters, 4 mm diameter, 0.45 µm pore size, nylon membrane, 
  National Scientific, Cat# F2504-1 
- 3 ml syringes with Luer-Lok tip 
- dialysis capillaries, Spectra/Por RC, MWCO 13,000, 200 µm i.d., 
  Spectra/Por, Cat# 132 290 
- Number 10 scalpel (curved blade) 
- 1-hexadecene, Sigma-Aldrich 
- HPF platelets, recess 0.3 mm (Cat# 242) and recess 0.1/0.2 mm 
  (Cat# 241), Technotrade International 

2. Cut the dry dialysis capillary to pieces of approximately 4 cm in length 
with a scalpel. Fill a 6-cm plastic dish with 1-hexadecene so that the HPF 
platelets are fully immersed. Turn the platelets with the 0.2 and 0.3 mm 
cavity upward, respectively. 

3. Pull up 2 ml yeast culture (OD600 = 0.2-0.5) in the syringe and filter it 
through the syringe filter until the entire volume is filtered or the filter 
clogs. Remove the syringe filter and carefully resuspend the yeast stuck 
on the filter in the remaining 50 µl of medium. Transfer the suspension 
to a 0.5-ml microcentrifuge tube and drop in the dialysis capillary so that 
the suspension is pulled in by capillary action. Immerse the filled capillary 
in 1-hexadecene and cut it into >2 mm short pieces with a scalpel 
(curved blades work best). Carefully fill a 0.2-mm cavity platelet with 
3-5 pieces using a fine tipped forceps, then transfer the platelet into the 
HPF sample holder. Make sure there are no air bubbles and cover the 
cavity with the other platelet (flat side down), close the sample holder 
and freeze in the HPF. 

4. Store the samples in cryovials under liquid nitrogen. 



6.2. Freeze-substitution 

1. Special materials needed: 

- EM-grade acetone 
- EM-grade ethanol 
- 0.5 ml microcentrifuge tubes 
- 2% (w/v) uranyl acetate (UA) in distilled water 

2. Dilute the 2% aqueous UA 1:20 in acetone to prepare the final FS 
medium containing 0.1% UA and 5% water in acetone. Aliquot 0.5 ml 
per microcentrifuge tube (labeled with pencil, as permanent markers 
easily wash off in the ethanol bath). 

3. Fill the AFS with liquid nitrogen and program as follows (see above for 
modifications): start at -90 °C for 4 h; warm at a rate of 3 °C/h; hold at 
-60 °C for 2 h; warm at 8 °C/h; hold at -18 °C for 72 h (duration of the 
actual FS is only 24 h). Start the program and hit "pause" to precool to 
-90 °C. Fill the AFS cups halfway with ethanol and allow them to cool 
for approximately 10-15 min. 

4. Separate the two HPF platelets enclosing the capillaries under liquid 
nitrogen. Take one microcentrifuge tube filled with the FS medium, 
open it and immerse it in the liquid nitrogen until it freezes up 
completely (this makes sure that your FS medium is at -95 °C when it 
melts so that the sample is not warmed prematurely). Transfer the HPF 
platelet containing the cells (i.e., the 0.1/0.2 mm platelet) on top of the 
frozen FS medium. Make sure any drops of liquid nitrogen that may 
have spilled in the tube have evaporated (otherwise the microcentrifuge 
tubes can explode!), close the tube and put it in the AFS cup (it is useful 
to make a small rack that fits in the cups). Repeat with all samples and 
start the AFS by hitting "pause" again. 

5. Used HPF platelets can be cleaned in acetone and reused multiple times 
until they show signs of deformation such as bulging. 
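
The dilution in step 2 can be checked with a few lines of arithmetic. The sketch below is a minimal illustration, assuming that "1:20" means one part aqueous stock in 20 parts final volume (which is what yields 5% water) and that volumes are additive; the function name `fs_medium` and the sample count are hypothetical.

```python
def fs_medium(total_ml, stock_ua_percent=2.0, dilution_factor=20):
    """Volumes and final concentrations for a 1:dilution_factor dilution.

    Assumes 1 part aqueous stock in dilution_factor parts final volume and
    treats the stock as essentially pure water when estimating % water.
    """
    stock_ml = total_ml / dilution_factor
    return {
        "stock_ml": stock_ml,
        "acetone_ml": total_ml - stock_ml,
        "ua_percent": stock_ua_percent / dilution_factor,
        "water_percent": 100.0 / dilution_factor,
    }


n_samples = 8                                  # hypothetical number of samples
recipe = fs_medium(total_ml=n_samples * 0.5)   # 0.5 ml of FS medium per tube
for name, value in recipe.items():
    print(f"{name}: {value:.2f}")
```

For eight tubes this gives 0.2 ml of 2% UA stock in 3.8 ml acetone, that is, 0.1% UA and 5% water as stated in the protocol.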



6.3. Embedding 

1. Special materials needed: 

- Wheaton Snap-Cap Specimen Vials, Ted Pella, Inc. 
- EM-grade ethanol 
- LR-Gold resin 
- benzoyl peroxide (thermal initiator) 
- benzoin methyl ether (UV initiator) 

2. Fill snap cap vials with ethanol (1.5 ml per FS sample). Mix 50% (v/v) 
LR-Gold (without initiators) in ethanol (0.5 ml per FS sample) in snap 
cap vials. When mixing LR-Gold always exclude air by flushing with 
dry nitrogen gas and mix by gentle pipetting (avoid bubbles). Precool the 
vials in the AFS for >20 min. 

3. When the samples have reached -18 °C gently wash them three times 
with the precooled ethanol and remove the HPF platelets with a forceps. 
The capillaries should fall out by themselves. If not, gently shake the 
microcentrifuge tube. Do not let the sample become dry; always leave a 
layer of liquid when exchanging solutions. 

4. Infiltrate with the 50% LR-Gold in ethanol for 2 h at -18 °C. 

5. Prepare and precool 100% LR-Gold (without initiators) and infiltrate at 
-18 °C for 2 h. 

6. Prepare and precool 100% LR-Gold with 0.1% (w/w) benzoyl peroxide 
and 0.1% (w/w) benzoin methyl ether (mix by gentle pipetting, avoid air 
bubbles). Infiltrate at -18 °C for 1 h, partially cover the top of the tubes 
with aluminum foil (for indirect UV), then mount the UV lamp and 
polymerize overnight at -18 °C. 

7. The next day check if the samples polymerized completely. Any 
unpolymerized resin with a reddish color came in contact with oxygen and 
will not polymerize. Remove the aluminum foil and let the AFS reach 
room temperature under UV for 3-4 h to make sure the polymerization 
completed. Remove any unpolymerized resin and leave the samples on a 
window sill for 1 day to degas. 

8. Remove the microcentrifuge tube with a razor blade and proceed to 
sectioning. 
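
The quantities in the embedding steps above scale with the number of samples and the amount of resin, which is easy to tabulate. The following Python sketch is only a planning aid under stated assumptions: the initiator percentage is taken relative to the resin mass, and all function and variable names are hypothetical.

```python
# Infiltration steps and durations (h) as listed in the protocol above.
INFILTRATION_STEPS_H = [
    ("50% LR-Gold in ethanol", 2.0),
    ("100% LR-Gold, no initiators", 2.0),
    ("100% LR-Gold with initiators", 1.0),
]


def total_infiltration_hours(steps=INFILTRATION_STEPS_H):
    """Sum the durations of all infiltration steps."""
    return sum(hours for _, hours in steps)


def initiator_mg(resin_g, percent_w_w=0.1):
    """Mass (mg) of one initiator, taken as % (w/w) of the resin mass."""
    return resin_g * 1000.0 * percent_w_w / 100.0


def solution_volumes_ml(n_samples):
    """Wash ethanol and 50% resin mix needed for n_samples FS samples."""
    return {
        "ethanol_ml": 1.5 * n_samples,
        "lr_gold_50_percent_ml": 0.5 * n_samples,
    }


print(total_infiltration_hours())   # total infiltration time in hours
print(initiator_mg(10.0))           # mg of each initiator for 10 g resin
print(solution_volumes_ml(4))       # volumes for four samples
```

The three infiltration steps add up to 5 h, which agrees with the infiltration time quoted for LR-Gold in the Embedding section.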

6.4. Anti-GFP immunolabeling 

1. Special materials needed: 

- formvar-filmed TEM copper mesh grids 
- blocking buffer (0.2% BSA, 0.2% fish skin gelatin in PBS) 
- 0.5% glutaraldehyde (EM grade) in distilled water 
- anti-GFP goat primary antibody, Fitzgerald Industries, Cat# 70R-GG001 
- rabbit anti-goat 10 nm-gold secondary antibody, Ted Pella, Cat# 15796 

2. Pick up 70 nm thin sections on formvar-filmed copper TEM grids and 
allow to dry. 

3. Prepare a moist chamber: place a piece of wet paper in your staining 
dish (either using staining molds or drops on parafilm). 

4. Block unspecific binding sites by incubating the sections on a drop of 
blocking buffer (BB) for 5 min. 

5. Incubate with primary antibody (1:100 in BB) for 30 min. 

6. Wash on five drops of BB. 

7. Incubate with secondary antibody (1:50 in BB) for 30 min. 

8. Wash on five drops of PBS. 

9. Fix on 0.5% glutaraldehyde for 5 min. 

10. Wash on three drops of water. 

11. Stain with 2% UA in water and Reynolds lead citrate as needed. 
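
The labeling steps above can be summarized to estimate how long the sections sit in aqueous solutions and how much antibody to prepare. This Python sketch is illustrative only: the drop volume per grid (`drop_ul`) is an assumed value that the protocol does not specify, dilutions are read as one part antibody in the stated number of parts final volume, and all names are hypothetical.

```python
# Timed incubations (min) from the protocol; washes and staining are not timed.
LABELING_STEPS_MIN = [
    ("blocking", 5),
    ("primary antibody", 30),
    ("secondary antibody", 30),
    ("glutaraldehyde fixation", 5),
]


def timed_incubation_minutes(steps=LABELING_STEPS_MIN):
    """Total duration of the timed incubation steps."""
    return sum(minutes for _, minutes in steps)


def antibody_mix_ul(n_grids, dilution, drop_ul=20.0):
    """Antibody and blocking buffer volumes for n_grids drops.

    drop_ul is an assumed drop size, not a value given in the protocol.
    """
    total_ul = n_grids * drop_ul
    antibody_ul = total_ul / dilution
    return {"antibody_ul": antibody_ul, "buffer_ul": total_ul - antibody_ul}


print(timed_incubation_minutes())
print(antibody_mix_ul(6, dilution=100))  # primary antibody, 1:100
print(antibody_mix_ul(6, dilution=50))   # secondary antibody, 1:50
```

The timed incubations sum to 70 min, consistent with the roughly 1 h in aqueous buffers mentioned in the Conclusions; washes and staining come on top of that.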

Abstract 

Eukaryotic cells contain at least two types of cytoplasmic RNA-protein (RNP) 
granules that contain nontranslating mRNAs. One such RNP granule is a P-body, 
which contains translationally inactive mRNAs and proteins involved in mRNA 
degradation and translation repression. A second such RNP granule is a stress 
granule which also contains mRNAs, some RNA binding proteins and several 
translation initiation factors, suggesting these granules contain mRNAs stalled 
in translation initiation. In this chapter, we describe methods to analyze 
P-bodies and stress granules in Saccharomyces cerevisiae, including procedures 
to determine if a protein or mRNA can accumulate in either granule, 
if an environmental perturbation or mutation affects granule size and number, 
and granule quantification methods. 




1. Introduction 

An important aspect of the control of gene expression is the range of 
degradation and translation rates of different mRNAs. Recently, evidence 
has begun to accumulate that the control of translation and mRNA degra- 
dation can involve a pair of conserved cytoplasmic RNA granules 
(Fig. 25.1). One class of such RNA granules is the cytoplasmic processing 
body, or P-body. P-bodies are dynamic aggregates of untranslating mRNAs 
in conjunction with translational repressors and proteins involved in deadenylation, 
decapping and 5' to 3' exonucleolytic decay (Parker and Sheth, 
2007). P-bodies and the mRNPs assembled within them are of interest for 
several reasons. They have been implicated in translational repression 
(Coller and Parker, 2005; Holmes et al., 2004), normal mRNA decay 
(Cougot et al., 2004; Sheth and Parker, 2003), nonsense-mediated decay 
(Sheth and Parker, 2006; Unterholzner and Izaurralde, 2004), miRNA-mediated 
repression in metazoans (Liu et al., 2005; Pillai et al., 2005), and 
mRNA storage (Bhattacharyya et al., 2006; Brengues et al., 2005). At a 
minimum, P-bodies serve as markers that are proportional to the concen- 
tration of mRNPs complexed with the mRNA decay/translation repression 
machinery and may have additional biochemical properties that affect the 
control of mRNA translation and/or degradation. 

Stress granules are a second mRNP granule implicated in translational 
control, and have been extensively studied in mammalian cells; for reviews, 
see Anderson and Kedersha (2006, 2008). Stress granules are generally not 
observable under normal growth conditions in yeast or mammalian cells and 
greatly increase in response to defects in translation initiation including 
decreased function of eIF2 or eIF4A (Dang et al., 2006; Kedersha et al., 
2002; Mazroui et al., 2006). Because stress responses often involve a transient 
inhibition of translation initiation, stress granules accumulate during a 
wide range of stress responses. Stress granules have been argued to function 
as "triage" centers for mRNAs exiting polysomes during stress, wherein 
mRNAs are either sorted to P-bodies for decay, maintained in a stored 
nontranslating state, or returned to translation (Anderson and Kedersha, 
2006, 2008). 

Recent results have shown that mRNP granules similar to mammalian 
stress granules can form in budding yeast. This was first suggested by the 
observation that the translation initiation factors eIF4E, eIF4G, and Pab1p, 
components of mammalian stress granules, formed foci in yeast during 
glucose deprivation and high OD conditions, which could either colocalize 
with or be distinct from P-bodies (Brengues and Parker, 2007; Hoyle et al., 
2007). These stress granules, also called EGP bodies, also contain mRNAs 
(Hoyle et al., 2007). Further evidence that these EGP bodies or yeast stress 
granules are equivalent to mammalian stress granules is that they contain the 
yeast orthologs of several proteins seen in mammalian stress granules 
(Table 25.2) and share similar rules of assembly (Buchan et al., 2008). Such 
assembly rules include a requirement for nontranslating mRNA, stimulation 
by decreased functional levels of the translation initiation factor eIF2, and a 
requirement for similar protein assembly factors (Buchan et al., 2008). 

Figure 25.1 P-bodies and stress granules under different growth and induction 
conditions. (A) Mid-log wild-type cells (yRP840), transformed with pRP1660. Dcp2-mCh 
serves as the P-body marker. Note: Very faint and large foci in P-body column (A and B) 
are in fact vacuolar autofluorescence. (B) Mid-log wild-type cells (yRP840), 
transformed with pRP1660, and subjected to 10 min of glucose deprivation stress. Dcp2-mCh 
serves as the P-body marker. (C) High OD wild-type cells (BY4741), transformed 
with pRP1657; 2 days growth in minimal media. Edc3-mCh serves as the P-body 
marker. Note: Pab1 foci in high OD may not be directly equivalent to mid-log stress 
granules (see main text). 

P-bodies and stress granules interact and are often docked in mammalian 
cells, whereas in budding yeast they predominantly overlap (Buchan et al., 
2008; Kedersha et al., 2005). Budding yeast provides a good system for 
analyzing P-body and stress granule interactions as assembly of both granules 
can be prevented or modified in various mutant strains (Buchan et al., 2008; 
Coller and Parker, 2005; Decker et al., 2007; Teixeira and Parker, 2007). 
Indeed, such experiments have suggested that stress granule formation in 
some cases is enhanced by preexisting P-bodies, suggesting a functional 
relationship between the two. 

Analysis of mRNA turnover and translational repression can involve 
examining aspects of both P-body and stress granule composition and 
assembly, given the concentration of mRNAs, decay factors, translational 
repressors and initiation factors in these granules. In this chapter, we 
describe methods to analyze P-bodies and stress granules in the budding 
yeast, Saccharomyces cerevisiae. We focus on describing methods to address 
three common questions: (a) Does a given protein or mRNA accumulate in 
P-bodies or stress granules? (b) Does a specific perturbation (e.g., mutation, 
overexpression, or environmental cue) qualitatively change the size or 
number of P-bodies or stress granules? (c) Is there a quantifiable change in 
the number and size of P-bodies or stress granules in a given population of 
cells? 




2. Determining If a Specific Protein Can Accumulate in P-Bodies or Stress Granules 

2.1. Markers of P-bodies and stress granules 

A common experimental goal is determining if a given protein accumulates 
in P-bodies or stress granules. Previous work has identified many proteins 
enriched in yeast P-bodies (Table 25.1). These include a conserved core of 
proteins found in P-bodies from yeast to mammals that consists of the 
mRNA decapping machinery. Core yeast P-body components include 
the decapping enzyme, Dcp1/Dcp2, the activators of decapping Dhh1, 
Pat1, Scd6, Edc3, the Lsm1-7 complex, and the 5'-3' exonuclease, Xrn1. 
Some proteins observed in yeast P-bodies, such as proteins involved in 
nonsense-mediated decay (Sheth and Parker, 2006), are only observed 
in P-bodies under certain mutant, cell-type, stress, or overexpression con- 
ditions (Table 25.1). These proteins may normally rapidly transit through 
P-bodies, but under some conditions accumulate to detectable levels. 

Characterization of yeast stress granule composition is at a more nascent 
state than that of P-bodies; nonetheless a number of factors have been 
identified including the translation initiation factors eIF4Gl, eIF4G2, 
eIF4E, and Pabl. Additional components include orthologs of factors 
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Table 25.1 Protein components of P-bodies 



Core components 


Function 


Dcpl 


Decapping enzyme subunit fl 


Dcp2 


Catalytic subunit of decapping 




enzyme^ 


Dhhl 


DEAD box helicase required for 




translational repression; decapping 




activator* 


Edc3 


Decapping activator/P-body 




assembly factor ' c 


Lsml-7 


Sm-like proteins involved in 




decapping^ 


Patl 


Decapping activator and 




translational repress or a 


Scd6 


Protein containing Sm-like and FDF 




domain; involved in translation 




repression 


Xrnl 


5' to 3' exonuclease

Ccr4/Pop2/Not1-5       Major cytoplasmic deadenylase^a,e

Proteins involved in NMD function
Upf1                   ATP-dependent helicase required for NMD, accumulates in yeast
                       P-bodies in dcp1Δ, xrn1Δ, dcp2Δ, upf2Δ, and upf3Δ strains^f
Upf2                   Component required for NMD, accumulates in P-bodies in dcp1Δ,
                       dcp2Δ, and xrn1Δ mutants^f
Upf3                   Component required for NMD, accumulates in P-bodies in dcp1Δ,
                       dcp2Δ, and xrn1Δ mutants^f
Ebs1                   Putative ortholog of human Smg7, accumulates in P-bodies during
                       glucose deprivation^g

Translation and translational repression function
Cdc33                  eIF4E: mRNA m7G cap binding protein^h,i
Pab1                   Binds poly(A) sequences; promotes translation initiation and
                       mRNA stability^h,i
Ngr1/Rbp1              RNA-binding protein, localizes to P-bodies during stress^j
Sbp1                   Facilitates mRNA decapping^k
Tif4631                eIF4G1: component of eIF4F initiation factor^h,i
Tif4632                eIF4G2: component of eIF4F, induced at stationary phase^h,i
Ded1                   DEAD box helicase implicated in translational control^l

Additional components  Function
Dcs2                   Nutrient stress-dependent regulator of the scavenger enzyme Dcs1^m
Pby1                   Putative tubulin tyrosine ligase^n
Rpb4                   Subunit of RNA polymerase II^o
Rpm2                   Protein component of the mitochondrial RNase P^p
Vts1                   May recruit Ccr4-Pop2 deadenylation complex to mRNAs;
                       accumulates in P-bodies in xrn1Δ strains^q

a Sheth and Parker (2003).
b Kshirsagar and Parker (2007).
c Decker et al. (2007).
d Barbee et al. (2006).
e Muhlrad and Parker (2005).
f Sheth and Parker (2006).
g Luke et al. (2007).
h Brengues and Parker (2007).
i Hoyle et al. (2007).
j Jang et al. (2006).
k Segal et al. (2006).
l Beckham et al. (2008).
m Malys and McCarthy (2006).
n Sweet et al. (2007).
o Lotan et al. (2005).
p Stribinskis and Ramos (2007).
q Rendl et al. (2008).

implicated in mammalian stress granule assembly, namely Pub1 (TIA-1),
Ngr1/Rbp1 (TIA-R), and Pbp1 (Ataxin-2; see Table 25.2 for a complete
list), all of which have been implicated in regulation of mRNA stability and
translational control of specific yeast mRNAs (Buu et al., 2004; Duttagupta
et al., 2005; Ruiz-Echevarria and Peltz, 2000; Tadauchi et al., 2004;
Vasudevan et al., 2005).

To date, all proteins identified in yeast stress granules, which we define as
foci distinct from P-bodies (as visualized by Edc3- or Dcp2-mCh), are also
seen to partially or predominantly overlap with P-bodies during stress
(Fig. 25.1; Buchan et al., 2008), and therefore are sometimes classified as
being present in P-bodies. This is due to the overlap of P-bodies and stress
granules in budding yeast and highlights the difficulty in unambiguously
defining a protein as solely being present in P-bodies or stress granules.
More realistically, these types of observations suggest that there is a
continuum of mRNP states between P-bodies and stress granules, as individual
mRNAs in one biochemical state exchange proteins and remodel into the
predominant state accumulating in a different granule. Indeed, stress
granules that are distinct from bright P-bodies can often exhibit a very faint
microscopic signal for either Edc3- or Dcp2-mCh, which often fades with
time during stress (Buchan et al., 2008). However, the abundance of
Dcp2 and Edc3 in "P-body distinct" stress granules is extremely low relative
to their concentration in P-bodies, as judged by microscopic intensity
measurements.

Table 25.2 Protein components of stress granules

Core components   Function
Pab1              Binds poly(A) sequences; promotes translation initiation and
                  mRNA stability^a,b
Tif4631           eIF4G1: component of eIF4F initiation factor^a,b
Tif4632           eIF4G2: component of eIF4F, induced at stationary phase^a,b
Cdc33             eIF4E: mRNA m7G cap binding protein^a,b
Pub1              Poly(A)/(U) binding protein; stabilizes mRNAs; possesses
                  QN domain (TIA-1 ortholog)^c
Ngr1/Rbp1         RNA binding protein; destabilizes mRNAs (TIA-R ortholog)^c
Pbp1              Polyadenylation regulator; translational regulator
                  (Ataxin-2 ortholog)^c
Eap1              eIF4E binding protein; role in maintaining genetic stability^c
Gbp2              Poly(A) binding protein; mRNA export role^c
Hrp1              Cleavage factor I subunit; mRNA 3' end cleavage and
                  polyadenylation^c
Nrp1              Putative RNA binding protein^c
Ygr250c           Putative RNA binding protein^c

a Brengues and Parker (2007).
b Hoyle et al. (2007).
c Buchan et al. (2008).

To determine if a given protein can accumulate in P-bodies or stress
granules, one simply needs to examine the subcellular distribution of the
protein relative to known P-body or stress granule components. The
simplest way to do this is to tag the protein of interest with a fluorescent
protein, and then examine its localization relative to another fluorescent
protein-tagged P-body or stress granule component. For this type of
experiment, many of the core P-body and stress granule components are
available as fusions to fluorescent proteins on yeast plasmids (Table 25.3).
Alternatively, P-bodies can be visualized in fixed cells by standard
immunofluorescence methods using antisera against specific components
(Gaillard and Aguilera, 2008). To our knowledge, no one has attempted to
detect yeast stress granules via such methods. To examine if a protein
accumulates in yeast P-bodies or stress granules using a fluorescent fusion
protein, one can take the following steps:

(a) Obtain a fluorescently tagged version of the protein of interest. Note
that most of the yeast ORFs fused to GFP are available from a genomic
collection (Huh et al., 2003) and can be purchased from Invitrogen. Use
of the native promoter in fusion strains integrated into the genome
should avoid mislocalization due to overexpression of the tagged
protein.

(b) Determine if the fusion protein is functional by some criterion, as it is
problematic to interpret the localization of nonfunctional proteins.

(c) Compare the localization of the tagged protein of interest with a core
P-body/stress granule marker tagged with a different fluorescent
protein as described below.

In our experience, the most reliable and easily visualized components of
yeast P-bodies are Edc3 and Dcp2, whereas Pab1 and Pub1 serve as good
stress granule markers (Buchan et al., 2008). Use of more than one marker
for P-bodies and stress granules is good practice, as specific mutants or stress
conditions can specifically affect individual components of P-bodies and
stress granules while not necessarily perturbing overall granule assembly
(Teixeira and Parker, 2007).



2.2. Preparation of samples

To determine if a protein can accumulate in P-bodies or stress granules, we
recommend examining its subcellular distribution in mid-log phase and also
in stress conditions such as glucose deprivation, heat shock, hyperosmotic
stress, high OD, or when decapping is inhibited (Buchan et al., 2008; Grousl
et al., 2009; Sheth and Parker, 2003; Teixeira et al., 2005). While all of these
conditions increase P-bodies to varying extents, glucose deprivation and
heat shock are the only well-characterized stress conditions identified that
strongly induce formation of stress granules, although high OD typically
induces 1-2 large Pab1 foci that are distinct from bright P-bodies and may
be similar to stress granules (see p. 16 and Fig. 25.1C). Additionally, treatment
with 0.5% (v/v) sodium azide for 30 min induces stress granule-like
foci that require nontranslating mRNA and compositionally resemble the
stress granules seen in glucose-deprived cells (unpublished data).

Table 25.3 Core P-body and stress granule components available on plasmids as
fluorescent protein fusions

Protein                  Fluorescent tag   Plasmid marker   Lab plasmid number
Dcp2 full-length         GFP               LEU2             pRP1175^a
                         GFP               TRP1             pRP1316^b
                         GFP               URA3             pRP1315^c
Dcp2 full-length         RFP               LEU2             pRP1155^d
                         RFP               TRP1             pRP1156^e
                         RFP               URA3             pRP1186^f
Dcp2 truncated (1-300)   RFP               LEU2             pRP1167^g
                         RFP               TRP1             pRP1165^h
                         RFP               URA3             pRP1152^h
Dhh1                     GFP               LEU2             pRP1151^a
Edc3                     mCherry           URA3             pRP1574^i
                         mCherry           TRP1             pRP1575^i
Lsm1                     GFP               LEU2             pRP1313^a
                         GFP               URA3             pRP1314^j
Lsm1                     mCherry           LEU2             pRP1400^k
Lsm1                     RFP               URA3             pRP1084^l
                         RFP               LEU2             pRP1085^l
Pat1                     GFP               URA3             pRP1501^m
Pab1                     GFP               URA3             pRP1362^n
                         GFP               TRP1             pRP1363^n
Pub1                     mCherry           URA3             pRP1661^i
                         mCherry           TRP1             pRP1662^i
Pab1 + Edc3              GFP/mCherry       URA3             pRP1657^i
                         GFP/mCherry       TRP1             pRP1659^i
Pab1 + Dcp2              GFP/mCherry       URA3             pRP1658^i
                         GFP/mCherry       TRP1             pRP1660^i

a Coller and Parker (2005).
b Segal et al. (2006).
c Unpublished, Parker Lab.
d Teixeira et al. (2005).
e Sheth and Parker (2006).
f Teixeira et al. (2005).
g Sweet et al. (2007).
h Muhlrad and Parker (2005).
i Buchan et al. (2008).
j Tharun et al. (2005).
k Beckham et al. (2007).
l Sheth and Parker (2003).
m Pilkington and Parker (2008).
n Brengues and Parker (2007).

It is important to be careful when preparing cells for examining P-bodies
or stress granules microscopically, as both are dynamic and can change
rapidly in response to a variety of stresses (Buchan et al., 2008; Teixeira
et al., 2005). If this is a serious issue in an experiment, one could in principle
use fixed cells, although at this time no one has detected stress granules in
yeast by standard immunofluorescence methods. Some tips to consider in
this type of experiment are as follows: (i) In general, cells that are growing
vigorously prior to the onset of a stress tend to show greater induction of
P-bodies and stress granules; therefore, ensuring optimal aeration and
temperature of cultures is important. (ii) Care should be taken to reduce
centrifugation and handling times, as variations can alter P-bodies and stress
granules due to their rapid dynamics. (iii) Media conditions may alter
P-body and stress granule composition; for example, different results can
be obtained from growth in rich versus minimal media for noncore
proteins. (iv) Finally, not all lab strains of S. cerevisiae should be assumed to
behave equally. For example, we have observed that our lab strain yRP840
(cross of S288C and A364A; Hatfield et al., 1996) induces brighter and
slightly larger stress granules than BY4741 during glucose deprivation,
while simultaneously exhibiting a lower level of P-bodies than BY4741
during normal mid-log growth conditions (Buchan et al., 2008). Detailed
protocols for examining P-bodies and stress granules in mid-log cultures,
± glucose deprivation stress, are described in the following sections.

2.2.1. Examination of P-bodies and stress granules in mid-log
glucose-deprived cells

1. In a 50-ml conical flask, placed in a 30 °C shaking water bath, grow a
5-ml yeast culture in YPD* or minimal media as appropriate to mid-log
phase, with an absorbance between 0.3 and 0.5 at 600 nm. *YPD is not
suitable for resuspension of cells prior to microscopic examination due to
autofluorescence of the media in the GFP channel.

2. Decant 1-1.5 ml of culture into a microfuge tube, and centrifuge at
approximately 13,000 rpm for 30 s.

3. Remove the media without disturbing the cell pellet, and resuspend in
1-1.5 ml minimal media supplemented with the same amino acids that
the cells were originally grown in (assuming the original culture was in
minimal media; otherwise a complete mix is recommended for YPD-grown
cultures). This media can either contain glucose at the original
concentration (typically 2%), which acts as a negative control for P-body
and stress granule induction, or lack glucose entirely, which should
inhibit translation initiation and hence induce P-bodies and stress
granules.

4. Repeat steps 2 and 3 to wash out residual glucose, then decant cells into a
fresh 50-ml flask and return to shaking at 30 °C for 10 min.

5. Repeat steps 2 and 3, then concentrate cells in approximately 40-80 μl
of minimal media ± glucose as appropriate. Keep in a microfuge tube
with constant flicking by hand to maintain aeration.

6. Add a small volume of the suspension to the slide at the microscope, then
place a coverslip on the sample, press and wipe down gently with a
Kimwipe to remove excess volume under the coverslip (this prevents cell
movement), then examine immediately. When comparing live cell
samples, be sure to be consistent about the time period during which the
cells are examined on the microscope. This can be a serious issue, as we
observe that cells under a coverslip eventually induce a stress response,
probably due to lack of aeration, thereby artifactually increasing the
presence of P-bodies. In +glucose cells, this artifactual P-body
induction is not as strong as the P-body induction caused by glucose
deprivation in wild-type cells, but may nonetheless cloud subtle phenotypic
differences between strains. Almost no stress granule protein foci are observed
in +glucose cells that sit on slides for extended periods, even up to
45 min.

If a protein shows substantial overlap in its subcellular distribution with
core P-body or stress granule proteins, it can be inferred to be a P-body or
stress granule component. Naturally, other methods such as
coimmunoprecipitation or yeast two-hybrid interaction of your protein of
interest with a P-body or stress granule marker potentially offer additional
support for such conclusions.
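
As an illustration of what "substantial overlap" could mean in numbers, the following is a minimal sketch, assuming each channel has already been reduced to a binary mask of foci (1 = focus pixel). The function name and the toy masks are hypothetical, and the chapter does not prescribe any particular overlap statistic.

```python
def overlap_fraction(mask_a, mask_b):
    """Fraction of 'on' pixels in mask_a that are also 'on' in mask_b."""
    on_a = 0
    shared = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a:
                on_a += 1
                if b:
                    shared += 1
    return shared / on_a if on_a else 0.0


# Hypothetical foci masks: tagged protein of interest and a P-body marker.
protein_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
]
marker_mask = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

# 3 of the 5 protein pixels coincide with marker pixels.
print(overlap_fraction(protein_mask, marker_mask))
# All 3 marker pixels coincide with protein pixels.
print(overlap_fraction(marker_mask, protein_mask))
```

Reporting the fraction in both directions is a deliberate choice here: a protein can sit in every marker focus while also forming foci of its own, which is the situation described above for stress granule proteins that partially overlap P-bodies.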

2.2.2. Additional tips for optimal yeast fluorescence microscopy
of P-bodies and stress granules

1. Capture images rapidly to allow accurate assessment of the P-body and
stress granule state of the cells. As at least two channels need to be taken
to colocalize the P-body/stress granule markers with the protein of
interest, it is optimal to use a microscope that splits the beam to record
two or three channels simultaneously. If this is not possible, attempt to
reduce the time between channel images as much as possible. Yeast
P-bodies and stress granules can move slightly from their original
location within a period of a few seconds.

2. If cells are moving on the slide, which can make colocalization difficult,
an alternative is to immobilize the cells by coating the coverslips with the
lectin concanavalin A, which binds to the yeast cell wall. The protocol
we use to coat coverslips is to wash coverslips overnight in sterile-filtered
1 M NaOH. After washing well with distilled water, add concanavalin A
solution (0.5 g/l, Sigma #L7647, 10 mM phosphate buffer (pH 6), 1 mM
CaCl2, 0.02% azide) for 20 min with gentle shaking. After removal of
the solution, rinse once in distilled water, pour off the liquid, and let dry
overnight. The coverslips can be stored at room temperature after
coating. Experiment with different concentrations of yeast suspension to
put on these slides, because if the cell density is too high, cells have a
tendency to clump and not form the single focal plane that is preferable
for imaging.

3. If longer exposures or time-lapse data are required, we use an inverted
Deltavision microscope. In addition to normal microscope slides,
which can dry out over time around the edges of the coverslip, depending
on atmospheric conditions, we often use concanavalin A-coated glass
bottom microwell dishes (MatTek Corporation #P35G-1.5-14-C) with
the coverslip immersed in enough minimal media (supplemented as
appropriate for the experiment) to fully cover it.

4. Beware of the autofluorescent properties of yeast cells! All strains exhibit
modest autofluorescence in the GFP channel (while ade1/ade2 mutants
exhibit strong autofluorescence), which often increases during glucose
deprivation and high OD, and which can sometimes form confusing
foci-like structures. Additionally, weak vacuolar autofluorescence often
shows up in the RFP/mCh channel. Many proteins in yeast are not
expressed at levels high enough to easily distinguish between legitimate
foci and autofluorescence; therefore, ensure you have a firm idea of the
basal threshold level of intensity at which you can trust your signal in
your strain background of interest.

5. Ensure you conduct bleedthrough controls for your fluorescently tagged
proteins, so that errors in filter setup or fluorescent protein
choice do not lead to experimental artifacts. Simply examine one tagged
protein at a time, but image in all channels you eventually wish to
examine, ensuring that no signal from one channel is bleeding through
into another channel.

6. In order to be confident that the behavior of a given protein changes
under different conditions, or in different yeast strains, ensure all
microscope settings (e.g., exposure times) and post image-capture
manipulations (e.g., image scaling) are consistent across experiments.
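
The concanavalin A coating solution in tip 2 can be made up from concentrated stocks using C1 x V1 = C2 x V2. The sketch below is only a worked calculation: the final concentrations come from the text, while the batch volume and all stock concentrations (1 M phosphate buffer, 1 M CaCl2, 10% azide) are assumptions chosen for illustration.

```python
def stock_volume_ml(final_conc, stock_conc, final_volume_ml):
    """Volume of stock needed, from C1*V1 = C2*V2.

    final_conc and stock_conc must be given in the same units.
    """
    return final_conc * final_volume_ml / stock_conc


final_volume_ml = 50.0  # hypothetical batch size

# 0.5 g/l is the same as 0.5 mg/ml, so mass in mg = 0.5 * volume in ml.
con_a_mg = 0.5 * final_volume_ml

# Stock concentrations below are assumptions, not from the protocol.
phosphate_ml = stock_volume_ml(10, 1000, final_volume_ml)  # 10 mM from 1 M (1000 mM)
cacl2_ml = stock_volume_ml(1, 1000, final_volume_ml)       # 1 mM from 1 M (1000 mM)
azide_ml = stock_volume_ml(0.02, 10, final_volume_ml)      # 0.02% from 10%

print(f"concanavalin A: {con_a_mg:.1f} mg")
print(f"1 M phosphate buffer (pH 6): {phosphate_ml:.2f} ml")
print(f"1 M CaCl2: {cacl2_ml:.2f} ml")
print(f"10% azide: {azide_ml:.2f} ml")
```

The remaining volume is made up with distilled water; if your stocks differ, only the second argument of each call changes.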




3. Monitoring Messenger RNA in P-Bodies 

In some cases it is useful or important to determine whether bulk
mRNA or a specific transcript is accumulating in P-bodies. There are two
general approaches to this question. First, one can use fluorescence in situ
hybridization (FISH) techniques to monitor the presence of bulk mRNA
using an oligo(dT) probe, or specific mRNAs using sequence-specific
probes. Such approaches have worked well in mammalian cells (Franks and
Lykke-Andersen, 2007; Pillai et al., 2005), and while detection of single
mRNA species has been demonstrated in yeast (see Garcia et al., 2007;
Zenklusen et al., 2008), it is currently unclear if the majority of yeast
P-bodies (or stress granules) remain assembled during FISH protocols (see
Brengues and Parker, 2007). Though this can likely be optimized, one can
no longer observe in vivo mRNA dynamics after cell fixation.

In an alternative method, one can use "GFP-tagged" mRNAs to follow
the localization of specific transcripts in yeast and determine if they
accumulate in P-bodies. To visualize specific mRNAs, multiple binding sites
for an RNA binding protein fused to a fluorescent protein are inserted into
the 3' UTR of the mRNA of interest, allowing its subcellular distribution to
be examined by following the location of the RNA binding protein fused to
the fluorescent protein. Most commonly, the well-characterized U1A or
MS2 binding sites are inserted into the 3' UTR of the mRNA of interest
(for detailed methods, see Bertrand et al., 1998; Brodsky and Silver,
2000). These mRNA constructs are then coexpressed with either U1A-GFP
(Brodsky and Silver, 2000) or the MS2 coat protein fused to GFP
(Bertrand et al., 1998). Both of these have nano- to picomolar affinity for
their respective binding sites, allowing detection of the mRNA.

Several of these types of engineered mRNAs have been constructed on
plasmids and used to demonstrate the accumulation of specific yeast
mRNAs in P-bodies. Available plasmids expressing "tagged" versions of
the stable PGK1 and the unstable MFA2 mRNA are described in
Table 25.4. In addition, variants of the tagged PGK1 mRNAs are available
with premature nonsense codons in specific positions, which can be used for
examining the accumulation of mRNA in P-bodies due to the action of
NMD (Sheth and Parker, 2006). A variety of plasmids expressing the MS2
or U1A proteins fused to GFP are also available (Table 25.4).




4. Determining If a Mutation/Perturbation 
Affects P-Body or Stress Granule Size 
and Number 

4.1. Conditions to observe increases or decreases in P-bodies 
and stress granules 

A common experimental issue is determining if a mutation, protein
overexpression, or an environmental cue affects the size and number of
P-bodies or stress granules. To examine if P-bodies or stress granules are
altered under a certain condition, we make three suggestions. First, one
should use multiple markers of P-bodies or stress granules to ensure that any
differences seen are not unique to a single protein. Second, since a specific
mutant may affect P-bodies or stress granules only under certain conditions,
we recommend that P-bodies and stress granules be examined under multiple
conditions (e.g., mid-log growth, glucose deprivation, high OD, etc.).
Finally, we recommend quantification of the number of P-bodies and stress
granules by computational methods to allow unbiased calculation of the
size and number of P-bodies and stress granules present in a given situation
(see Section 5).

Table 25.4 Plasmids for localizing mRNA in yeast cells: GFP fusion proteins that
bind to specific binding sites in mRNA engineered in their 3' UTR

Protein + tag   Plasmid marker   Promoter   Lab plasmid number
MS2 CP-GFP      HIS3             Met25      pRP1094^a
U1A-GFP         TRP1             GPD        pRP1187^b
U1A-GFP         LEU2             GPD        pRP1194^c

RNA      Binding seq.   Plasmid marker   Promoter   Description^i                                      Lab plasmid number
MFA2pG   MS2            URA3             GPD        Two MS2 sites 3' to poly(G) tract in 3' UTR        pRP1081^d
MFA2     MS2            URA3             GPD        Two MS2 sites in 3' UTR                            pRP1083^e
MFA2     U1A            URA3             GPD        PGK1 3' UTR with 16 U1A binding sites              pRP1193^c
MFA2     U1A            URA3             Tet-Off    PGK1 3' UTR with 16 U1A binding sites              pRP1291^c
PGK1     MS2            URA3             PGK1       Two MS2 sites 3' to poly(G) tract in 3' UTR        pRP1086^c
PGK1     U1A            URA3             PGK1       16 U1A sites in 3' UTR                             pPS2037 (pRP1354)^f
PGK1     U1A            URA3             PGK1       PGK1 U1A with nonsense mutation at position 22     pRP1295^g
PGK1     U1A            URA3             PGK1       PGK1 U1A with nonsense mutation at position 225    pRP1296^g
PGK1     U1A            URA3             GAL        16 U1A sites in 3' UTR                             pRP1303^h

a Beach and Bloom (2001).
b Teixeira et al. (2005).
c Brengues et al. (2005).
d M. Valencia-Sanchez and R. Parker (unpublished).
e Sheth and Parker (2003); has a short poly(G) tract in the 3' UTR that does not inhibit exonucleolytic decay.
f Brodsky and Silver (2000).
g Sheth and Parker (2006).
h U. Sheth and R. Parker (unpublished).
i All mRNA constructs have their native 5' and 3' UTR except where noted.

In practice, there are different methods for examining if a perturbation
reduces or increases granule size or number. To determine if a perturbation
reduces P-bodies or stress granules, it is most convincing to examine
conditions when both are strongly induced and easily detectable. For
example, the intensity and number of P-bodies and stress granules increase
during glucose deprivation, particularly in the presence of constitutively
active Gcn2c kinase, which ultimately limits translation initiation by
reducing functional levels of eIF2 (Buchan et al., 2008). P-bodies are also
constitutively induced when decapping or 5' to 3' degradation is inhibited
by dcp1Δ or xrn1Δ, respectively. This makes dcp1Δ and xrn1Δ possible
strains in which to examine the presence of proteins in P-bodies, with
the caveat that this is a specific mutant condition that may not reflect the
normal situation (Buchan et al., 2008; Teixeira et al., 2005). Stress granules
are also increased in dcp1Δ and xrn1Δ strains during glucose deprivation as
compared to wild-type cells (Buchan et al., 2008). Thus, the above
conditions and strains can be used to examine if P-bodies or stress granules
are reduced by a mutation or physiological response (e.g., see Buchan et al.,
2008; Decker et al., 2007; Teixeira and Parker, 2007). Conversely, P-bodies
are small and infrequent, and stress granules nonexistent, when cells are
undergoing mid-log growth (Buchan et al., 2008; Teixeira et al., 2005),
which makes this condition an ideal situation to see if a given perturbation
increases either granule (Teixeira and Parker, 2007). P-bodies are also
clearly induced at high OD, and while distinct aggregates of Pab1, eIF4E,
and eIF4G are also observed under these conditions (Brengues et al., 2007;
Fig. 25.1C), a rigorous analysis of the composition and assembly mechanisms
of this granule has not yet been completed; thus its significance is
currently unclear.

4.2. Interpreting alterations in P-body/stress granule 
size and number 

Any alteration observed in P-body size and number, due to a specific
mutation or alteration in growth, can be due to a variety of mechanisms.
This is because various changes in cell physiology will affect the size and
number of P-bodies, probably due to an altered flux of mRNPs in and out of
this structure (Table 25.5). For example, the size and number of P-bodies
can be increased by defects in mRNA decapping or 5' to 3' degradation,
which increase the pool of mRNPs in P-bodies by decreasing the
destruction of mRNAs in this compartment (Sheth and Parker, 2003), or by
defects in translation initiation, which increase the pool of untranslating
mRNPs associated with P-bodies (Brengues et al., 2005). Alternatively,
P-bodies can be reduced in size and number by inhibiting translation
repression, by preventing dissociation of elongating ribosomes from mRNA,
by removing interactions that promote aggregation of the individual mRNPs
into larger structures, or by reductions in the level of the P-body marker
being examined (Coller and Parker, 2005; Decker et al., 2007). Note that in
order to be confident of the underlying mechanism affecting P-body size and
number, additional experiments should be performed to identify the true
cause of the defect (Table 25.5).

Table 25.5 Dissection of effects on P-body size and number

Observation                          Possible cause                                       Follow-up experiments
Increase in P-body size and number   1. Inhibition of decapping or 5' to 3' degradation   1. Examine mRNA degradation rate and whether mRNAs are degraded normally
                                     2. Defects in translation initiation                 2. Measure the rate of protein synthesis by polysomes or 35S incorporation
Decrease in P-body size and number   1. Slowed translation elongation rate                1. Polysome analysis to determine if the size of polysomes is increased
                                     2. Enhanced translation/decreased repression         2. Measure the rates of protein synthesis by polysomes or 35S incorporation
                                     3. Reduced expression of marker proteins             3. Western blot to check P-body protein levels
                                     4. Impaired granule assembly mechanism               4. Mutational/protein interaction analyses, genetic analyses, etc.

Based on our findings in yeast, which suggest that P-bodies promote
yeast stress granule assembly, and existing mammalian cell data, it seems
likely that all of the above processes could also affect yeast stress granule
numbers. Indeed, preventing ribosomal dissociation (cycloheximide) or
limiting translational repression (dhh1Δ, pat1Δ, and dhh1Δ pat1Δ strains)
inhibits stress granule assembly, whereas specific blocks in mRNA decay
(dcp1Δ, xrn1Δ) increase stress granules (Buchan et al., 2008). Moreover,
because defects in P-body formation can reduce stress granules in budding
yeast (Buchan et al., 2008), an observed defect in stress granules could be
due to alterations in P-body formation or function.

5. Quantification of P-Body Size and Number

A common and important issue is determining if quantitative differences
exist in the size and number of P-bodies or stress granules under
different cellular conditions, or between particular mutant strains. Ideally,
one should use unbiased approaches to determine the effects. Thus, having
one person prepare the images and another person score them blindly is
strongly recommended.
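
One way to set up blind scoring is for the person who prepared the images to replace the file names with neutral codes and keep the key private until scoring is complete. The sketch below assumes images are identified by file name; the function name, the coding scheme, and the example file names are all hypothetical.

```python
import random


def blind_names(filenames, seed=None):
    """Return {original_name: coded_name}, with codes assigned in random order."""
    rng = random.Random(seed)
    shuffled = list(filenames)
    rng.shuffle(shuffled)
    key = {}
    for index, name in enumerate(shuffled, start=1):
        # Keep the file extension so the coded file still opens normally.
        extension = name.rsplit(".", 1)[-1] if "." in name else ""
        coded = f"image_{index:03d}" + (f".{extension}" if extension else "")
        key[name] = coded
    return key


images = [
    "wt_glucose.tif",
    "wt_noglucose.tif",
    "dcp1_glucose.tif",
    "dcp1_noglucose.tif",
]
key = blind_names(images, seed=1)

# The preparer keeps this key; the scorer sees only the coded names.
for original, coded in sorted(key.items(), key=lambda item: item[1]):
    print(coded, "<-", original)
```

Passing a seed makes the assignment reproducible, which helps if the key file is lost; omit it for a fresh random assignment each time.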

In our experience to date, P-bodies and stress granules present different
quantification challenges, especially when strains bearing plasmid-expressed
versions of a stress granule marker are used. This is because most stress
granule markers (e.g., Pab1, Pub1) are distributed throughout the
cytoplasm, and the intensity of a Pab1 or Pub1 stress granule focus is often
not much greater than the intensity of the diffuse cytoplasmic protein (e.g.,
a 1.5- to 4-fold difference). Given that yeast cells can harbor differing copy
numbers of plasmids from cell to cell, in practice the intensity of a stress
granule focus in one cell may not be as bright as the diffuse cytoplasmic
signal in another cell, making automated counting methods using thresholds
problematic. This issue can be countered by using fluorescently tagged
proteins expressed from the chromosome, since this reduces, but does not
eliminate, cell-to-cell variability. Such tagged proteins can be made using
standard techniques and vectors (Longtine et al., 1998). In contrast, most
P-body proteins exhibit a very low diffuse cytoplasmic signal and a very high
foci signal (e.g., a 5- to 30-fold difference under some conditions). Thus, for
P-bodies, semiautomated scoring approaches are much more feasible.
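
The thresholding problem can be made concrete with a small model: a single image-wide cutoff separates foci from cytoplasm only if the dimmest focus is still brighter than the brightest diffuse cytoplasm. The sketch below is a simplification under stated assumptions: each cell is described by one cytoplasmic intensity and one focus intensity, and the expression levels and fold differences are hypothetical values chosen within the ranges quoted above.

```python
def global_threshold_exists(cells):
    """cells: list of (cytoplasm_intensity, focus_intensity), one pair per cell.

    A single cutoff works only if every focus is brighter than every cytoplasm.
    """
    brightest_cytoplasm = max(cyto for cyto, _ in cells)
    dimmest_focus = min(focus for _, focus in cells)
    return dimmest_focus > brightest_cytoplasm


def simulate(expression_levels, focus_to_cytoplasm_ratio):
    """Model each cell's focus as a fixed multiple of its diffuse signal."""
    return [(level, level * focus_to_cytoplasm_ratio) for level in expression_levels]


# Hypothetical 4-fold cell-to-cell variation in marker expression,
# e.g. from differing plasmid copy number.
expression_levels = [100, 200, 400]

stress_granule_like = simulate(expression_levels, 2.0)   # within the 1.5- to 4-fold range
p_body_like = simulate(expression_levels, 10.0)          # within the 5- to 30-fold range

print(global_threshold_exists(stress_granule_like))  # False
print(global_threshold_exists(p_body_like))          # True
```

In this model a global cutoff exists exactly when the focus-to-cytoplasm ratio exceeds the cell-to-cell spread in expression, which is why reducing that spread with chromosomal tagging helps.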

In our lab, scoring of both granules is accomplished by using ImageJ, a
freely downloadable image analysis package from the NIH (Abramoff et al.,
2004), although other software and algorithms have been successfully
employed elsewhere for similar purposes (e.g., MATLAB; Aragon et al.,
2008). On the ImageJ web site, there are instructions and downloadable
plugins, which allow direct loading of raw images from the microscope.
An alternative source of ImageJ, preloaded with many useful image analysis
plugins, can be obtained from the Wright Cell Imaging Facility web site:
http://www.uhnres.utoronto.ca/facilities/wcif/download.php.

The semiautomated quantification of P-bodies is accomplished by
setting a threshold mask, which allows regions of the image above a certain
intensity to be scored as "on" and the rest of the image as "off." The P-bodies
are counted as the "on" regions, and their number, area, and intensity
can be computed. To further reduce bias, the masking procedure
that determines "real" foci can be performed automatically using the Otsu
Thresholding Filter (Otsu, 1979).
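
For readers unfamiliar with Otsu's method, it picks the cutoff that maximizes the variance between the "off" and "on" pixel classes. The following is a minimal textbook-style sketch in plain Python, assuming integer pixel values; it is not the code of the ImageJ plugin, and the toy pixel list is hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the cutoff t maximizing between-class variance.

    Classes are {value <= t} ("off") and {value > t} ("on").
    pixels: iterable of integers in the range [0, levels).
    """
    histogram = [0] * levels
    for value in pixels:
        histogram[value] += 1
    total = sum(histogram)
    total_sum = sum(level * count for level, count in enumerate(histogram))

    best_threshold = 0
    best_variance = -1.0
    weight_low = 0   # number of pixels at or below t
    sum_low = 0      # summed intensity of those pixels
    for t in range(levels):
        weight_low += histogram[t]
        sum_low += t * histogram[t]
        weight_high = total - weight_low
        if weight_low == 0 or weight_high == 0:
            continue  # one class is empty, so variance is undefined
        mean_low = sum_low / weight_low
        mean_high = (total_sum - sum_low) / weight_high
        # Proportional to between-class variance (constant 1/total^2 dropped).
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_variance = variance
            best_threshold = t
    return best_threshold


# Toy image: dim background (10), slightly brighter haze (12), bright foci (200).
pixels = [10] * 90 + [12] * 5 + [200] * 5
cutoff = otsu_threshold(pixels)
foci_pixels = [p for p in pixels if p > cutoff]
print(cutoff, len(foci_pixels))
```

Because the cutoff is derived from the image histogram alone, two people analyzing the same image obtain the same mask, which is the bias-reducing property the text refers to.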

Protocols for quantifying P-bodies and stress granules are detailed in the 
following sections. 
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5.1. Semiautomated quantification of P-body 
size and number 

1. Open the image in ImageJ to quantify. If using a Z-stack image, rather 
than a single plane image, the former of which benefits from capturing 
the entire thickness of the cell, ensure that the Z-stack is first collapsed 
(Image menu — Stacks — Z-project — choose Max intensity projection). 
Collapsed Z-stack images will capture all foci in a cell, and therefore are 
best for accurate quantitation. In contrast, single-plane images tend to be 
more "aesthetically pleasing," as the total cellular volume is not averaged 
into one plane; this also makes colocalization of proteins less 
problematic. 

2. Next, select Process, then Smooth — this subtly averages intensity of each 
pixel with the intensity of surrounding pixels, helping discriminate real 
foci from image capture artifacts. 

3. Go to Process, select Math and then select Subtract to remove the 
background. This is a key step in the process, which helps discriminate 
meaningful foci from background noise caused by the media, the micro- 
scope, and the diffuse level of the protein/RNA being examined that is 
present throughout the cell. This level should be individually tailored 
according to the use of different microscopes, different exposure condi- 
tions, and examination of different proteins. Note that because of 
threshold subtractions, such quantifications are not necessarily absolute 
numbers or areas of P-bodies but provide a systematic and unbiased 
measure of relative P-body number and area within experiments. 

4. Go to Plugins, select Filters and then select Otsu Thresholding. This 
option will only be available if you have downloaded the plugin and 
placed it in the Plugins directory, within the ImageJ directory, on your 
computer (see ImageJ web site for plugins). 

5. Go to Image, select Lookup Tables and then select Invert LUT. This 
will reverse the image, allowing the P-bodies to be considered "on." 

6. Go to Analyze and select Analyze Particles, and in the size bar, set the 
pixel size range to be counted. This defines the size at which a focus is 
considered to be a real P-body, for which the appropriate range will vary 
depending on image resolution, samples, and strains. Optimization is 
required so random speckles do not count as P-bodies and large fluores- 
cent regions unrelated to P-bodies are not counted (e.g., autofluorescent 
debris in the media) . After pressing ok, a table is generated that lists the 
number of foci, average area, and total area of each foci. Modifying the 
data reported for each focus within the foci list can be achieved by going 
to Analyze — Set Measurements, then ticking the boxes for each param- 
eter you wish to measure. 

7. Optional: Prior to pressing OK in step 6, select Masks in the pull-down 
menu. In addition to the tabulated data, this will present a graphical 
display of the thresholded foci so that you can visually check that no 
erroneous foci have been counted. 

8. The number of cells in each image must be counted manually; however, 
all other steps can be automated by writing or recording a macro 
(Plugins > Macros > Record) for higher-throughput analysis. Copying the 
data output from step 6 directly into programs such as Microsoft Excel 
allows easy manipulation of the data. 
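
The logic of steps 3 to 6 (background subtraction, Otsu thresholding, and size-filtered particle counting) can be illustrated outside ImageJ. The following is only a minimal sketch in plain Python, not the ImageJ implementation: the function names, the use of a constant background value, the 8-bit intensity range, and the 8-connected definition of a particle are all assumptions made for illustration.

```python
from collections import deque


def otsu_threshold(pixels, levels=256):
    """Return the gray level that maximizes between-class variance (Otsu, 1979).

    Assumes integer pixel values in the range [0, levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * n for i, n in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # pixels at or below the candidate threshold
    sum0 = 0  # summed intensity of those pixels
    for t in range(levels):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def count_foci(image, background, min_area, max_area):
    """Return the areas (in pixels) of foci with min_area <= area <= max_area."""
    # Step 3: subtract a constant background (an assumption), clipping at zero.
    img = [[max(v - background, 0) for v in row] for row in image]
    # Step 4: pixels brighter than the Otsu threshold are treated as "on".
    t = otsu_threshold([v for row in img for v in row])
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    areas = []
    for y in range(h):
        for x in range(w):
            if img[y][x] <= t or seen[y][x]:
                continue
            # Step 6: flood-fill one 8-connected particle and measure its area.
            area, queue = 0, deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                area += 1
                for ny in range(cy - 1, cy + 2):
                    for nx in range(cx - 1, cx + 2):
                        if (0 <= ny < h and 0 <= nx < w
                                and not seen[ny][nx] and img[ny][nx] > t):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if min_area <= area <= max_area:
                areas.append(area)
    return areas
```

As in step 6, the size range decides what counts as a focus: with a minimum area of 2 pixels, an isolated bright pixel is discarded as a speckle, whereas a minimum of 1 pixel would count it.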



5.2. Manual quantification of stress granule size and number 

Repeat steps 1 and 2 from above 

3. As a uniform background level from which foci can be distinguished is 
harder to obtain, it is easier to divide the image into sections (e.g., group 
of cells with similar cytoplasmic intensity), and quantify separately. 
To do this, click any of the four shape buttons (bottom left on the ImageJ 
bar) to draw rectangular, circular, polygonal, or freehand selections 
on the image. 

4. Go to Image > Adjust > Threshold, and choose intensity limits (use the 
slider bars or set values manually) within which your foci signal, but 
not the diffuse cytoplasmic signal, lies. 

5. Repeat steps 6-8 from above as necessary. 



ACKNOWLEDGMENTS 

We thank the members of the Parker lab for helpful discussions, particularly Carolyn Decker 
for proofreading and suggestions. NIH grant (R37 GM45443) and funds from the Howard 
Hughes Medical Institute supported this work. TN was supported in part by T32 CA09213. 



REFERENCES 

Abramoff, M. D., Magalhaes, P. J., and Ram, S. J. (2004). Image processing with ImageJ. 

Biophotonics Int. 11, 36-42. 
Anderson, P., and Kedersha, N. (2006). RNA granules. J. Cell Biol. 172, 803-808. 
Anderson, P., and Kedersha, N. (2008). Stress granules: The Tao of RNA triage. Trends 

Biochem. Sci. 33, 141-150. 
Aragon, T., van Anken, E., Pincus, D., Serafimova, I. M., Korennykh, A. V., Rubio, C. A., 

and Walter, P. (2008). Messenger RNA targeting to endoplasmic reticulum stress 

signaling sites. Nature 457, 736-740. 
Barbee, S., et al. (2006). Staufen- and FMRP-containing neuronal RNPs are structurally and 

functionally related to somatic P bodies. Neuron 52, 997-1009. 
Beach, D. L., and Bloom, K. (2001). ASH1 mRNA Localization in three acts. Mol. Biol. Cell 

12, 2567-2577. 



638 J. Ross Buchan et al. 



Beckham, C. J., et al. (2007). Interactions between brome mosaic virus RNAs and cytoplasmic 
processing bodies. J. Virol. 81, 9759-9768. 
Beckham, C., et al. (2008). The DEAD-box RNA helicase Ded1p affects and accumulates in 

Saccharomyces cerevisiae P-bodies. Mol. Biol. Cell 19, 984-993. 
Bertrand, E., Chartrand, P., Schaefer, M., Shenoy, S. M., Singer, R. H., and Long, R. M. 

(1998). Localization of ASH1 mRNA particles in living yeast. Mol. Cell 2, 437-445. 
Bhattacharyya, S. N., Habermacher, R., Martine, U., Closs, E. I., and Filipowicz, W. 

(2006). Relief of microRNA-mediated translational repression in human cells subjected 

to stress. Cell 125, 1111-1124. 
Brengues, M., and Parker, R. (2007). Accumulation of polyadenylated mRNA, Pab1p, 

eIF4E, and eIF4G with P-bodies in Saccharomyces cerevisiae. Mol. Biol. Cell 18, 2592-2602. 
Brengues, M., Teixeira, D., and Parker, R. (2005). Movement of eukaryotic mRNAs 

between polysomes and cytoplasmic processing bodies. Science 310, 486-489. 
Brodsky, A. S., and Silver, P. A. (2000). Pre-mRNA processing factors are required for 

nuclear export. RNA 6, 1737-1749. 
Buchan, J. R., Muhlrad, D., and Parker, R. (2008). P bodies promote stress granule assembly 

in Saccharomyces cerevisiae. J. Cell Biol. 183, 441-455. 
Buu, L. M., Jang, L. T., and Lee, F. J. (2004). The yeast RNA-binding protein Rbp1p 

modifies the stability of mitochondrial porin mRNA. J. Biol. Chem. 279, 453-462. 
Coller, J., and Parker, R. (2005). General translational repression by activators of mRNA 

decapping. Cell 122, 875-886. 
Cougot, N., Babajko, S., and Seraphin, B. (2004). Cytoplasmic foci are sites of mRNA 

decay in human cells. J. Cell Biol. 165, 31-40. 
Dang, Y., Kedersha, N., Low, W. K., Romo, D., Gorospe, M., Kaufman, R., Anderson, P., 

and Liu, J. O. (2006). Eukaryotic initiation factor 2 alpha independent pathway of stress 

granule induction by the natural product pateamine A. J. Biol. Chem. 281, 32870-32878. 
Decker, C. J., Teixeira, T., and Parker, R. (2007). Edc3p and a glutamine/asparagine-rich 

domain of Lsm4p function in processing body assembly in Saccharomyces cerevisiae. J. Cell 

Biol. 179, 437-449. 
Duttagupta, R., Tian, B., Wilusz, C. J., Khounh, D. T., Soteropoulos, P., Ouyang, M., 

Dougherty, J. P., and Peltz, S. W. (2005). Global analysis of Pub1p targets reveals 

a coordinate control of gene expression through modulation of binding and stability. 

Mol. Cell. Biol. 25, 5499-5513. 
Franks, T. M., and Lykke-Andersen, J. (2007). TTP and BRF proteins nucleate processing 

body formation to silence mRNAs with AU-rich elements. Genes Dev. 21, 719-735. 
Gaillard, H., and Aguilera, A. (2008). A novel class of mRNA-containing cytoplasmic 

granules are produced in response to UV-irradiation. Mol. Biol. Cell 19, 4980-4992. 
Garcia, M., Darzacq, X., Delaveau, T., Jourdren, L., Singer, R. H., and Jacq, C. (2007). 

Mitochondria-associated yeast mRNAs and the biogenesis of molecular complexes. 

Mol. Biol. Cell 18, 362-368. 
Grousl, T., Ivanov, P., Frydlova, I., Vasicova, P., Janda, F., Vojtova, J., Malinska, K., 

Malcova, I., Novakova, L., Janoskova, D., Valasek, L., and Hasek, J. (2009). Robust 

heat shock induces eIF2alpha-phosphorylation-independent assembly of stress granules 

containing eIF3 and 40 S ribosomal subunits in budding yeast, Saccharomyces cerevisiae. 

J. Cell Sci. 122, 2078-2088. 
Hatfield, L., Beelman, C. A., Stevens, A., and Parker, R. (1996). Mutations in trans-acting 

factors affecting mRNA decapping in Saccharomyces cerevisiae. Mol. Cell. Biol. 16, 

5830-5838. 
Holmes, L. E., Campbell, S. G., De Long, S. K., Sachs, A. B., and Ashe, M. P. (2004). Loss 

of translational control in yeast compromised for the major mRNA decay pathway. Mol. 

Cell. Biol. 24, 2998-3010. 



P-Bodies and Stress Granules in Budding Yeast 639 



Hoyle, N. P., Castelli, L. M., Campbell, S. G., Holmes, L. E., and Ashe, M. P. (2007). 

Stress-dependent relocalization of translationally primed mRNPs to cytoplasmic granules 

that are kinetically and spatially distinct from P-bodies. J. Cell Biol. 179, 65-74. 
Huh, W. K., Falvo, J. V., Gerke, L. C., Carroll, A. S., Howson, R. W., Weissman, J. S., and 

O'Shea, E. K. (2003). Global analysis of protein localization in budding yeast. Nature 425, 

686-691. 
Jang, L. T., Buu, L. M., and Lee, F. J. (2006). Determinants of Rbp1p localization in specific 

cytoplasmic mRNA-processing foci, P-bodies. J. Biol. Chem. 281, 29379-29390. 
Kedersha, N., Chen, S., Gilks, N., Li, W., Miller, I. J., Stahl, J., and Anderson, P. (2002). 

Evidence that ternary complex (eIF2-GTP-tRNA(i)(Met))-deficient preinitiation 

complexes are core constituents of mammalian stress granules. Mol. Biol. Cell 13, 

195-210. 
Kedersha, N., Stoecklin, G., Ayodele, M., Yacono, P., Lykke-Andersen, J., Fritzler, M. J., 

Scheuner, D., Kaufman, R. J., Golan, D. E., and Anderson, P. (2005). Stress granules and 

processing bodies are dynamically linked sites of mRNP remodeling. J. Cell Biol. 169, 

871-884. 
Kshirsagar, M., and Parker, R. (2007). Identification of Edc3p as an enhancer of mRNA 

decapping in Saccharomyces cerevisiae. Genetics 166, 729-739. 
Liu, J., Rivas, F. V., Wohlschlegel, J., Yates, J. R. 3rd, Parker, R., and Hannon, G. J. (2005). 

A role for the P-body component GW182 in microRNA function. Nat. Cell Biol. 7, 

1261-1266. 
Longtine, M. S., McKenzie, A. 3rd, Demarini, D. J., Shah, N. G., Wach, A., Brachat, A., 

Philippsen, P., and Pringle, J. R. (1998). Additional modules for versatile and economical 

PCR-based gene deletion and modification in Saccharomyces cerevisiae. Yeast 14, 953-961. 
Lotan, R., et al. (2005). The RNA polymerase II subunit Rpb4p mediates decay of a specific 

class of mRNAs. Genes Dev. 19, 3004-3016. 
Luke, B., et al. (2007). Saccharomyces cerevisiae Ebs1p is a putative ortholog of 

human Smg7 and promotes nonsense-mediated mRNA decay. Nucleic Acids Res. 35, 

7688-7697. 
Malys, N., and McCarthy, J. E. (2006). Dcs2, a novel stress-induced modulator of m7GpppX 

pyrophosphatase activity that locates to P bodies. J. Mol. Biol. 363, 370-382. 
Mazroui, R., Sukarieh, R., Bordeleau, M. E., Kaufman, R. J., Northcote, P., Tanaka, J., 

Gallouzi, I., and Pelletier, J. (2006). Inhibition of ribosome recruitment induces stress 

granule formation independently of eukaryotic initiation factor 2 alpha phosphorylation. 

Mol. Biol. Cell 17, 4212-4219. 
Muhlrad, D., and Parker, R. (2005). The yeast EDC1 mRNA undergoes deadenylation- 

independent decapping stimulated by Not2p, Not4p, and Not5p. EMBO J. 24, 

1033-1045. 
Otsu, N. (1979). Threshold selection method from gray-level histograms. IEEE Trans. Syst. 

Man Cybern. 9, 62-66. 
Parker, R., and Sheth, U. (2007). P bodies and the control of mRNA translation and 

degradation. Mol. Cell. 25, 635-646. 
Pilkington, G. R., and Parker, R. (2008). Pat1 contains distinct functional domains that 

promote P-body assembly and activation of decapping. Mol. Cell. Biol. 28, 1298-1312. 
Pillai, R. S., Bhattacharyya, S. N., Artus, C. G., Zoller, T., Cougot, N., Basyuk, E., 

Bertrand, E., and Filipowicz, W. (2005). Inhibition of translational initiation by Let-7 

MicroRNA in human cells. Science 309, 1573-1576. 
Rendl, L. M., Bieman, M. A., and Smibert, C. A. (2008). S. cerevisiae Vts1p induces 

deadenylation-dependent transcript degradation and interacts with the Ccr4p-Pop2p- 

Not deadenylase complex. RNA 14, 1328-1336. 
Ruiz-Echevarria, M. J., and Peltz, S. W. (2000). The RNA binding protein Pub1 modulates 

the stability of transcripts containing upstream open reading frames. Cell 101, 741-751. 






Segal, S. P., Dunckley, T., and Parker, R. (2006). Sbp1p affects translational repression and 

decapping in Saccharomyces cerevisiae. Mol. Cell. Biol. 26, 5120-5130. 
Sheth, U., and Parker, R. (2003). Decapping and decay of messenger RNA occur in 

cytoplasmic processing bodies. Science 300, 805-808. 
Sheth, U., and Parker, R. (2006). Targeting of aberrant mRNAs to cytoplasmic processing 

bodies. Cell 125, 1095-1109. 
Stribinskis, V., and Ramos, K. S. (2007). Rpm2p, a protein subunit of mitochondrial RNase 

P, physically and genetically interacts with cytoplasmic processing bodies. Nucleic Acids 

Res. 35, 1301-1311. 
Sweet, T. J., et al. (2007). Microtubule disruption stimulates P-body formation. RNA 13, 

493-502. 
Tadauchi, T., Inada, T., Matsumoto, K., and Irie, K. (2004). Posttranscriptional regulation 

of HO expression by the Mkt1-Pbp1 complex. Mol. Cell. Biol. 24, 3670-3681. 
Teixeira, D., and Parker, R. (2007). Analysis of P-body assembly in Saccharomyces cerevisiae. 

Mol. Biol. Cell 18, 2274-2287. 
Teixeira, D., Sheth, U., Valencia-Sanchez, M. A., Brengues, M., and Parker, R. (2005). 

Processing bodies require RNA for assembly and contain nontranslating mRNAs. RNA 

11, 371-382. 
Tharun, S., Muhlrad, D., Chowdhury, A., and Parker, R. (2005). Mutations in the 

Saccharomyces cerevisiae LSM1 gene that affect mRNA decapping and 3' end protection. 

Genetics 170, 33-46. 
Unterholzner, L., and Izaurralde, E. (2004). SMG7 acts as a molecular link between mRNA 

surveillance and mRNA decay. Mol. Cell 16, 587-596. 
Vasudevan, S., Garneau, N., Tu, Khounh D., and Peltz, S. W. (2005). p38 mitogen- 

activated protein kinase/Hog1p regulates translation of the AU-rich-element-bearing 

MFA2 transcript. Mol. Cell. Biol. 25, 9753-9763. 
Zenklusen, D., Larson, D. R., and Singer, R. H. (2008). Single-RNA counting reveals 

alternative modes of gene expression in yeast. Nat. Struct. Mol. Biol. 15, 1263-1271. 




CHAPTER TWENTY-SIX 

Analyzing mRNA Expression Using 
Single mRNA Resolution Fluorescent 
In Situ Hybridization 

Daniel Zenklusen and Robert H. Singer 



Contents 

1. Introduction 
2. Probe Design 
3. Probe Labeling 
   3.1. Materials 
   3.2. Protocol 
   3.3. Measuring labeling efficiency 
4. Cell Fixation, Preparation, and Storage 
   4.1. Materials 
   4.2. Protocol 
5. Hybridization 
   5.1. Materials 
   5.2. Probes used for the hybridization shown in Figs. 26.1 and 26.3 
   5.3. Protocol 
6. Image Acquisition 
   6.1. Microscope (example) 
7. Image Analysis 
8. Summary and Perspectives 
Acknowledgments 
References 



Abstract 

As the product of transcription and the blueprint for translation, mRNA is the 
main intermediate product of the gene expression pathway. The ability to 
accurately determine mRNA levels is, therefore, a major requirement when 
studying gene expression. mRNA is also a target of different regulatory steps, 
occurring in different subcellular compartments. To understand the different 
steps of gene expression regulation, it is therefore essential to analyze mRNA in 
the context of a single cell, maintaining spatial information. Here, we describe a 
stepwise protocol for fluorescent in situ hybridization (FISH) that allows detec- 
tion of individual mRNAs in single yeast cells. This method allows quantitative 
analysis of mRNA expression in single cells, permitting "absolute" quantifica- 
tion by simply counting mRNAs. It further allows us to study many aspects of 
mRNA metabolism, from transcription to processing, localization, and mRNA 
degradation. 




1. Introduction 

The life cycle of an mRNA comprises many different steps. Starting 
with mRNA synthesis, mRNAs are processed, assembled into mRNPs, 
exported from the nucleus, sometimes localized, usually translated, and 
ultimately always degraded. These different steps along the gene expression 
pathway are tightly regulated and many are subjected to quality control steps 
that ensure their proper execution (Houseley and Tollervey, 2009; Moore 
and Proudfoot, 2009). How these different steps are carried out and what 
proteins are involved in these processes has been the focus of gene expression 
studies over the last few decades. The ability to detect and quantify mRNA 
levels has thus been a key requirement. Traditionally, mRNA detection is 
achieved using some kind of hybridization technique. While Northern blots 
are able to detect only a few mRNAs at a time, array technologies now 
allow expression studies of an entire organism in a single experiment 
(Ausubel, 1988; Coppee, 2008; Holstege et al., 1998). 

One limitation of arrays or Northern blots, however, is that large 
numbers of cells are required to isolate sufficient material to perform an 
experiment. Additionally, cells must be broken open to isolate RNA, and 
some RNA is lost or degraded during the isolation procedure. Because the 
cells are disrupted, spatial information is lost as well. The steps along the 
gene expression pathway occur in different cellular compartments, and 
preserving spatial information is often critical to understanding cellular 
processes. Furthermore, variability among different cells in a population 
cannot be observed by ensemble measurements. Cells at different cell cycle 
or developmental stages express unique sets of genes, and such alternate 
expression profiles are obscured when pooling cells. Finally, expression 
"noise" resulting from stochastic fluctuations in biological processes cannot 
be observed without single cell analysis (Elowitz et al., 2002; Kaufmann and 
van Oudenaarden, 2007). 

These limitations are circumvented by single cell analysis (Kaufmann and 
van Oudenaarden, 2007; Zenklusen et al., 2008). Spatial information 
and cell-to-cell differences are readily observed when molecules are 
detected in single cells, an approach made possible for proteins by the 
extensive use of fluorescent proteins (Shaner et al., 2007). Single cell 
techniques are less widely used, however, to analyze mRNA expression. 
Fluorescent in situ hybridization (FISH) is the most robust and 
straightforward method for single cell mRNA analysis (Dong et al., 2007; 
Long et al., 1997; Zenklusen et al., 2008). To detect mRNA in cells, 
fluorescently labeled probes are hybridized to fixed cells immobilized on 
glass slides. The technique is noninvasive, as no genetic modifications are 
necessary. Choosing well-designed probes coupled with bright fluorescent 
dyes allows the detection of single mRNA molecules in single yeast cells 
(Dong et al., 2007; Femino et al., 1998; Long et al., 1997; Zenklusen et al., 
2008). This technique has a wide range of applications in gene expression 
analysis; we have used FISH to study transcription, splicing, and mRNA 
localization (Dong et al., 2007; Long et al., 1997; Zenklusen et al., 2008). 

Yeast is an ideal system to perform single molecule expression analyses. 
Many genes in yeast are expressed at a very low level of fewer than 10 copies 
per cell (Holstege et al., 1998; Zenklusen et al., 2008). Therefore, 
"absolute" quantification of mRNA expression can be performed; the number of 
mRNA molecules can be determined simply by counting. The small size of 
a yeast cell is also advantageous in this case, allowing analysis of expression 
levels in many single cells simultaneously. Expression and localization 
studies can, therefore, be performed with unprecedented precision. 

In this chapter, we will progress through the different steps of 
performing a single mRNA resolution FISH experiment. We begin with 
how probes are designed and labeled before we describe a step-by-step 
protocol for FISH. Finally, we briefly describe some aspects of data analysis. 




2. Probe Design 

A crucial step in a successful FISH experiment is designing the FISH probes. 
To achieve single molecule sensitivity, multiple oligonucleotide probes, each 
labeled with up to five fluorescent dyes, are hybridized to an mRNA 
(Fig. 26.1). To allow the coupling of multiple dyes onto one probe, a minimal 
probe length is required. Probes should also be long enough to ensure high 
specificity and allow stringent hybridization conditions. Probes of around 50 
nucleotides (nt) in length with about 50% CG content typically work best, 
demonstrating high specificity under stringent hybridization conditions. As 
multiple probes against one gene are used, it is important to design probes with 
similar melting temperature. Using these standard settings (50 nt/50% CG) 
during probe design also facilitates the simultaneous use of differentially labeled 
probes against multiple target mRNAs (Fig. 26.1). 

Figure 26.1 Single mRNA sensitivity fluorescent in situ hybridization (FISH). 
(A) Schematic diagram of the FISH protocol. A mix of four 50 nt DNA 
oligonucleotides, each labeled with five fluorescent dyes, is hybridized to 
paraformaldehyde-fixed yeast cells to obtain single transcript resolution. 
(B) Single mRNA FISH for MDN1 mRNA. Single mRNAs are detected in the 
cytoplasm, with a higher intensity spot in the nucleus. Haploid and diploid 
yeast cells are shown. Probes hybridize to the 5' end of the mRNA. MDN1 
mRNA (red), DAPI (blue), and DIC. (C) Cartoon illustrating that the 
number of nascent mRNAs at the site of transcription is used to determine the 
polymerase loading on a gene using probes to the 5' end of MDN1. (D) Nascent 
transcripts of neighboring genes colocalize at the site of transcription. 
Diploid cells are hybridized with probes against MDN1 (red) and CCW12 
(green). The nucleolus is stained with probes against the ITS2 spacer of the 
rRNA precursor (yellow). A maximum projection of the 3D dataset and a single 
plane containing the transcription sites are shown (Zenklusen et al., 2008). 

Probes are designed using commercial DNA sequence analysis software 
such as Oligo (Molecular Biology Insights, Inc.). To find target sites, 
the gene of interest is scanned for 50 nt complementary sequences with 
~50% CG content. If none fitting the criteria can be found, the length of 
the probe can be adjusted by adding or removing a few bases while keeping 
a similar melting temperature. It is important not to use probes that form 
stable secondary structures, as this may interfere with efficient 
hybridization. Avoid using probes forming internal stem loops with a 
ΔG > -2.5 kcal/mol. Probes should also be tested for cross-hybridization to 
other genes, for example, by using BLAST in SGD (Saccharomyces Genome 
Database). Strong sequence homology is rare but can challenge probe design, 
for example, when designing probes for ribosomal protein genes, which are 
present in two copies per genome with strong sequence homology. 
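
The window scan described above can be sketched in a few lines of Python. This is only an illustration, not a replacement for dedicated design software: the function names and the 45-55% acceptance window around the 50% CG target are assumptions, and the sketch checks neither melting temperature, secondary structure, nor cross-hybridization.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases in seq."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def candidate_probes(mrna, length=50, gc_min=0.45, gc_max=0.55):
    """Scan the target for windows of the given length with ~50% GC.

    Returns (start, probe) pairs, where start is the 0-based position of the
    window in the mRNA and probe is the complementary DNA oligonucleotide.
    The gc_min/gc_max tolerance is an assumed, adjustable cutoff.
    """
    target = mrna.upper().replace("U", "T")
    hits = []
    for start in range(len(target) - length + 1):
        window = target[start:start + length]
        if gc_min <= gc_fraction(window) <= gc_max:
            hits.append((start, reverse_complement(window)))
    return hits
```

Overlapping windows are all reported; in practice one would select a few nonoverlapping candidates and then screen them for stem loops and cross-hybridization as described above.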

To incorporate multiple labels into a single DNA oligonucleotide probe, 
modified bases are inserted during synthesis. Inserting amino-allyl dTs 
allows efficient coupling with most commercially available dyes after 
synthesis. To avoid quenching of dyes, modified bases should be spaced by 
8-10 nt. Different companies synthesize oligos containing internal labels, 
but because of the relatively high cost it is often preferable to synthesize 
the probes on site if a DNA synthesis facility or a DNA synthesizer is 
available. Alternatively, probes containing a single modified base can be 
used. Such probes are synthesized by most companies and are much cheaper 
than probes bearing multiple labels. However, more probes have to be used to 
allow single molecule detection (Raj et al., 2008). 
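
Choosing which thymidines to replace with amino-allyl dT can be sketched as a simple greedy pass over the probe sequence. The helper below is hypothetical: it assumes that "spaced by 8-10 nt" can be approximated by requiring at least 8 nt between consecutive modified positions, and it stops at five labels per probe.

```python
def pick_label_sites(probe, min_spacing=8, max_labels=5):
    """Return 0-based positions of Ts to modify, at least min_spacing nt apart.

    Greedy from the 5' end: take the first T, then the next T that lies at
    least min_spacing positions downstream, and so on up to max_labels sites.
    """
    sites = []
    for i, base in enumerate(probe.upper()):
        if base != "T":
            continue
        if not sites or i - sites[-1] >= min_spacing:
            sites.append(i)
            if len(sites) == max_labels:
                break
    return sites
```

If fewer than the desired number of sites is returned, the probe has too few suitably spaced Ts, and a neighboring target window may be a better choice.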




3. Probe Labeling 

Single molecule detection requires high labeling efficiencies. We use 
cyanine dyes containing a monofunctional NHS-ester for efficient 
coupling to amino-allyl Ts. Cy3, Cy3.5, and Cy5 (CyDye, GE Healthcare) 
work well, but dyes with a monofunctional NHS-ester from other 
companies can also be used. Dyes in the green (emission below 500 nm) are 
less well suited for FISH in yeast, as cells show more background 
fluorescence and single molecule detection becomes difficult. 

Labeling is done as described by the manufacturer with minor 
modifications. We prepurify probes prior to labeling using a QIAquick 
Nucleotide Removal Column (Qiagen), as this has been shown to increase 
labeling efficiency. Five micrograms of DNA oligonucleotide is labeled using 
a single Amersham Cy3, Cy3.5, or Cy5 dye pack. When multiple probes against 
one gene are used, the probes can be pooled in equimolar ratios and the probe 
mix labeled together. 

Labeling efficiency is determined by measuring absorption in a 
spectrophotometer. If available, use a NanoDrop (Thermo Fisher), which allows 
measurement of low volumes (1 µl) and therefore reduces probe loss. Labeling 
efficiency is calculated using a formula that corrects for absorption of the 
fluorophore at 260 nm. A labeling efficiency of >90% should be obtained. For 
unknown reasons, the labeling efficiency of Cy3.5 is generally lower (75-80%). 



3.1. Materials 

• DNA oligonucleotide containing amino-allyl modified Ts 

• Mono-Reactive CyDye Cy3, Cy3.5, and Cy5 (GE Healthcare #PA23001, 
#PA23501, #PA25001) 

• QIAquick Nucleotide Removal Kit (Qiagen #28304) 

• Spectrophotometer 

• Labeling buffer (0.1 M sodium bicarbonate, pH 9.0) 

3.2. Protocol 

1. Measure concentrations of unlabeled DNA oligonucleotides. 

2. When using multiple probes against one gene, combine probes to a total 
of 5 µg of probes per labeling (for example, when using four probes to 
gene A, use 1.25 µg of each). 

3. Add 500 µl of buffer PN from the QIAquick kit and mix. 

4. Purify on a QIAquick column according to the protocol. To increase 
binding, load the sample twice onto the same column. 

5. Elute probes from columns using 40 µl H2O. Do not use the elution 
buffer from the kit. 

6. Lyophilize probes in a SpeedVac. 

7. Resuspend the DNA pellet in 10 µl labeling buffer and add to the 
dye-containing tube. 

8. Resuspend the dye by vortexing vigorously and then perform a quick 
spin to collect the labeling reaction at the bottom of the tube. 

9. Incubate in the dark at room temperature overnight. 

Purify the probes from the free dye using QIAquick columns: 

10. Add 500 µl of buffer PN to the labeling reaction and load onto the column. 

11. Spin through columns according to the protocol. 

12. Load the flow-through a second time onto the same column to increase 
probe recovery. 

13. Spin through columns according to the protocol. 

14. Wash column twice with buffer PE to remove all nonincorporated dye. 

15. Elute the labeled probes using 100 µl of elution buffer. 

16. Measure concentration and labeling efficiency using a spectrophotometer. 

17. Store probes at -20 °C in the dark. 



3.3. Measuring labeling efficiency 

To calculate the labeling efficiency, the extinction coefficients of the 
dye and the oligo and the absorbance of the sample at 260 nm and at the 
absorption maximum of the dye have to be considered. The molar extinction 
coefficient (ε) of the DNA oligonucleotides is calculated as described by the 
Beer-Lambert law (Cavaluzzi and Borer, 2004). A web site from an oligo 
synthesis company can be used for the calculation (we use 
http://www.idtdna.com/analyzer/Applications/OligoAnalyzer). To calculate the 
molecular weight of the amino-modified oligo, add 179.16 g/mol per modified 
base to the calculated molecular weight of the unmodified oligo. 

The exact DNA concentration [DNA] is calculated using Eq. (26.1) and the 
dye concentration [Dye] using Eq. (26.2). The labeling efficiency is 
then determined by dividing [Dye]/[DNA] by the number of modified 
bases on the probe, $N$ (Eq. (26.3)). $A_{\mathrm{DNA}}$ is the absorption of 
the sample at 260 nm, $A_{\mathrm{dye(max)}}$ is the absorption at the 
absorbance maximum of the dye, $\varepsilon_{\mathrm{dye}(260)}$ and 
$\varepsilon_{\mathrm{dye(max)}}$ are the extinction coefficients of the dye 
at 260 nm and at its absorbance maximum, and $\varepsilon_{\mathrm{DNA}}$ is 
the extinction coefficient of the DNA: 

\[
[\mathrm{DNA}] = \frac{A_{\mathrm{DNA}} - \varepsilon_{\mathrm{dye}(260)} \times \left( A_{\mathrm{dye(max)}} / \varepsilon_{\mathrm{dye(max)}} \right)}{\varepsilon_{\mathrm{DNA}} \times 0.1\ \mathrm{cm}} \tag{26.1}
\]

\[
[\mathrm{Dye}] = \frac{A_{\mathrm{dye(max)}}}{\varepsilon_{\mathrm{dye(max)}} \times 0.1\ \mathrm{cm}} \tag{26.2}
\]

\[
\text{Labeling efficiency} = \frac{[\mathrm{Dye}]}{[\mathrm{DNA}]} \times \frac{1}{N} \tag{26.3}
\]

Extinction coefficients of the dyes at 260 nm (ε260) and at their absorption 
maximum (εmax) are shown in the table as follows: 

Dye     ε260 (% of εmax)   εmax      Absorbance max (nm)   Emission max (nm) 
Cy3     12,000 (8%)        150,000   550                   570 
Cy3.5   40,800 (24%)       170,000   581                   596 
Cy5     12,500 (5%)        250,000   649                   670 
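
Equations (26.1)-(26.3) are easy to script. The following sketch simply transcribes them using the dye constants from the table; the function name, argument names, and the example readings are hypothetical, and the 0.1 cm path length is taken from the equations above (adjust it if your instrument reports absorbance for a different path length).

```python
# Extinction coefficients from the table above:
# dye -> (epsilon at 260 nm, epsilon at absorbance maximum)
DYES = {
    "Cy3": (12_000, 150_000),
    "Cy3.5": (40_800, 170_000),
    "Cy5": (12_500, 250_000),
}


def labeling_efficiency(a260, a_dye_max, eps_dna, dye, n_modified, path_cm=0.1):
    """Fraction of modified bases carrying a dye, following Eqs. (26.1)-(26.3).

    a260        absorbance of the labeled probe at 260 nm
    a_dye_max   absorbance at the absorbance maximum of the dye
    eps_dna     molar extinction coefficient of the oligonucleotide
    n_modified  number of amino-allyl modified bases per probe
    """
    eps_260, eps_max = DYES[dye]
    # Eq. (26.1): remove the dye's contribution to the 260 nm reading.
    dna_conc = (a260 - eps_260 * a_dye_max / eps_max) / (eps_dna * path_cm)
    # Eq. (26.2)
    dye_conc = a_dye_max / (eps_max * path_cm)
    # Eq. (26.3)
    return dye_conc / dna_conc / n_modified


# Hypothetical readings for a Cy3 probe with five modified bases:
efficiency = labeling_efficiency(a260=0.56, a_dye_max=0.75,
                                 eps_dna=500_000, dye="Cy3", n_modified=5)
print(f"Labeling efficiency: {efficiency:.0%}")
```

For a pool of probes labeled together, one reasonable choice (an assumption, not specified in the protocol) is to use the average extinction coefficient and average number of modified bases of the pooled oligonucleotides.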




4. Cell Fixation, Preparation, and Storage 

To prepare cells for FISH, cells are grown in the appropriate media 
and fixed by adding paraformaldehyde directly to the media. After extensive 
washes, the cell wall is removed using lyticase. Cells are digested in an 
isotonic buffer to prevent cells from bursting after the cell wall has been 
removed. Cells also become very fragile and strong shearing forces (exten- 
sive pipetting and vortexing) will break the cells open, so gentle handling is 
required. Complete digestion, however, is necessary to obtain optimal 
FISH results. Progression of the digest is, therefore, observed by visual 
inspection using phase contrast. Cells will turn dark when the cell wall is 
digested away, whereas undigested cells look transparent. Avoid digesting 
cells for too long, as overdigestion can lead to cell lysis. 

Following digestion, cells are attached to coverslips. Using round 18 mm 
coverslips allows most subsequent steps to be performed in 12-well 
tissue culture plates. The coverslips are coated with poly-L-lysine so that 
cells attach. Alternatively, precoated coverslips can be purchased from 
different vendors. Cells are spotted on coverslips and allowed to settle by 
gravity. Unadhered cells are washed off, and coverslips are finally stored in 
70% ethanol at -20 °C. Ethanol dissolves membranes, allowing better 
penetration of probes during the hybridization step, and serves at the same 
time as a preservative, permitting cells to be stored for many months. 

4.1. Materials 

• Paraformaldehyde 32% solution, EM grade (Electron Microscopy 
Sciences #15714) 

• Lyticase (Sigma #L2524; resuspend in 1x PBS to 25,000 U/ml and store 
at -20 °C) 

• Ribonucleoside-vanadyl complex (VRC; NEB #S1402S) 

• β-Mercaptoethanol 

• Sorbitol 

• 1 M KHPO4, pH 7.5 

• 70% ethanol 

• Noncoated coverslips (Fisherbrand Cover Glasses, Circles, No. 1; 
0.13-0.17 mm thick; size: 18 mm; #12-545-100) or 

• Precoated coverslips (Fisherbrand Coverglass for Growth, 18 mm; 
#12-545-84) 

• Poly-L-lysine (#P8920) 

• 12-well cell culture plates 

Solutions to be prepared: 

• Buffer B: 1.2 M sorbitol, 100 mM KHPO4, pH 7.5 

• Spheroplast buffer: 1.2 M sorbitol, 100 mM KHPO4 (pH 7.5), 20 mM 
ribonucleoside-vanadyl complex (VRC; NEB #S1402S), 20 mM 
β-mercaptoethanol, lyticase (25 U lyticase per OD of cells) 

• Resuspension buffer: 1.2 M sorbitol, 100 mM KHPO4 (pH 7.5), 20 mM 
ribonucleoside-vanadyl complex 



4.2. Protocol 

• Growth and fixation 

1. Grow 50 ml BY4741 cells in YPD in a 250-ml flask at 30 °C on an 
orbital shaker to an optical density at 600 nm (OD 600) of 0.6. 

2. Prepare a 50-ml Falcon tube containing 6.3 ml of 32% (v/v) 
paraformaldehyde. Paraformaldehyde is toxic; wear gloves and handle it in 
the fume hood! 

3. Fix cells by transferring 43.7 ml of culture to the 50-ml tube containing 
the paraformaldehyde (final concentration of 4%, v/v) and mix. 

4. Incubate cells for 45 min at room temperature on a tabletop shaker. 

5. Collect cells by centrifugation using a swinging bucket rotor at 
3500 rpm at 4 °C. 

6. Wash cells three times with 10 ml of cold buffer B. 

7. Resuspend cells in 1 ml buffer B and transfer cells to a 1.5-ml Eppen- 
dorf tube. 

8. Pellet cells using tabletop centrifuge (3 min, 4000 rpm). 

• Digestion 

9. Resuspend cells in 1 ml spheroplast buffer plus 30 µl of lyticase (at 
25 U/µl). 

10. Incubate cells at 30 °C for 8 min. 

11. Check the progression of the digest using a phase contrast microscope. 
Place 3.5 µl on a microscope slide, cover with a coverglass, and inspect 
digestion using a 20x objective. Undigested cells are transparent, while 
digested cells turn dark. If >80% of cells are digested, proceed to 
step 12. If fewer cells are digested, continue incubation and check for 
digestion every 2-3 min. 

12. Collect cells by centrifugation for 3 min at 3500 rpm at 4 °C. Do not 
spin at a higher speed or cells will break. 

13. Wash cells with 1 ml of cold buffer B (pipette carefully). 

14. Resuspend pellet in 1.5 ml of buffer B, keep on ice. 

• Attaching cells to coverslips 

15. Place poly-L-lysine treated 18 mm round coverslips face up into 
12-well tissue culture dishes, one coverslip per well. 

16. Drop 150 µl of cells onto the center of a coated coverslip. 

17. Let cells settle for 30 min at 4 °C. 

18. Slowly add 2 ml of buffer B to each well, then remove buffer B using a 
vacuum aspirator. This will remove cells not attached to the coverslip 
and leave a monolayer of immobilized cells. 

19. Slowly add 2 ml of 70% ethanol to each well. 

20. Store cells for at least 3 h at -20 °C. Cells can be stored at -20 °C for 
at least 6 months. 

• Preparing poly-L-lysine coverslips 

Carefully put one box of 18 mm round coverslips into 500 ml of 0.1 N HCl 
and boil for 10 min. Rinse extensively with H2O, autoclave, and store in 
70% ethanol. 

To coat coverslips with poly-L-lysine, place 100 µl of a 0.01% (w/v) 
poly-L-lysine solution onto a coverslip, incubate at room temperature for 
5 min, remove the solution using a vacuum pump, and let the remaining liquid 
dry. Then wash twice with H2O and let air dry. The poly-L-lysine coated 
coverslips can be stored for several months. 
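
When the culture volume or cell density differs from the example above, the fixative and lyticase amounts can be rescaled. The sketch below uses hypothetical helper names and two explicit assumptions: the fixative volume follows the usual dilution relation C1 x V1 = C2 x V2, and one "OD of cells" is taken to mean 1 ml of culture at OD 600 = 1.

```python
def stock_volume_ml(stock_pct, final_pct, total_ml):
    """Volume of stock solution needed to reach final_pct in total_ml."""
    return final_pct * total_ml / stock_pct


def lyticase_volume_ul(od600, culture_ml, units_per_od=25, stock_units_per_ul=25):
    """Lyticase stock volume for a culture, at 25 U per OD unit of cells.

    Assumes OD units = OD600 x culture volume in ml (an interpretation of
    "per OD of cells") and a stock of 25 U/µl (25,000 U/ml).
    """
    od_units = od600 * culture_ml
    return od_units * units_per_od / stock_units_per_ul


# 32% paraformaldehyde diluted to 4% in a 50-ml tube:
print(stock_volume_ml(32, 4, 50))    # 6.25 ml of stock, i.e. ~6.3 ml
# A nominal 50 ml culture at OD 600 of 0.6:
print(lyticase_volume_ul(0.6, 50))   # ~30 µl of lyticase stock
```

Both values agree with the amounts given in steps 2 and 9, which suggests that the 30 µl of lyticase in the protocol is scaled to the nominal 50 ml culture.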







5. Hybridization 

Only very low probe concentrations are needed in the hybridization 
reaction to allow single mRNA detection. Generally, 0.5 ng per probe per 
hybridization reaction is sufficient. To block nonspecific binding of the 
probes, competitor DNA and RNA are added in large excess to the 
hybridization solution. 

The formamide concentration in the hybridization mix and the 
subsequent wash steps is critical for optimal hybridization specificity. 
Generally, we use 40% formamide for standard probes (50 nt/50% GC), but 
if high background is observed, increasing the formamide concentration 
from 40% to 50% can reduce it. The entire pool of poly(A) mRNAs in the 
cell can be detected using a 50-nt poly-dT probe, but the formamide 
concentration then has to be reduced to 15%. 

For the hybridization step, the coverslip with the immobilized cells is 
inverted onto a droplet of the hybridization solution. Floating the 
coverslip on the hybridization solution distributes the solution evenly 
and gives the best results. This works much better than using 
multiwell microscope slides. Hybridization is done in a hybridization 
chamber overnight at 37 °C. The chamber is a simple, self-assembled unit 
consisting of a glass plate and two Parafilm layers separated by cardboard 
spacers (Fig. 26.2). 

After hybridization, the coverslips are placed back into a 12-well plate 
and washed extensively to ensure that all unbound probes are removed. 
After a short wash in a DAPI-containing solution, cells are mounted and are 
ready to be imaged. 



5.1. Materials 

• Glass plate, about 20 × 20 cm 

• Parafilm 

• Cardboard spacers 

• 12-well cell culture plates 

• Glass slide 

Solutions to be prepared: 

40% formamide/2× SSC 

2× SSC/0.1% Triton X-100 

1× SSC 

1× PBS 

Solution F (40% formamide, 2× SSC, 10 mM NaHPO4, pH 7.5) 

Solution H (2× SSC, 2 mg/ml BSA, 10 mM VRC) 

[Figure 26.2 labels: cardboard spacer; coverslip inverted with cells facing 
down (on the hybridization mix); hybridization mix; Parafilm 1] 

Figure 26.2 Hybridization chamber. The hybridization chamber is assembled using a 
glass plate, Parafilm and cardboard spacers. The coverslips with cells are inverted onto a 
drop of the hybridization solution placed onto the first Parafilm layer. To seal the 
chamber, a second layer of Parafilm is placed on top of the coverslips. To keep the 
second Parafilm layer from touching the coverslips, cardboard spacers are placed on 
both sides and in the middle of the first Parafilm layer. The interior volume of the 
chamber is small and evaporation is not a problem at 37 °C. However, the two layers of 
Parafilm have to be properly sealed to prevent evaporation. 



Escherichia coli tRNA (Roche #10 109 541 001) 

ssDNA (deoxyribonucleic acid, single stranded from salmon testes, Sigma 
#D9156) 

DAPI solution (0.5 µg/ml DAPI (Sigma #D9564) in 1× PBS. Store at 
4 °C in the dark) 

Mounting solution (ProLong Gold antifade reagent (Invitrogen #P36934)) 



5.2. Probes used for the hybridization shown in Figs. 26.1 
and 26.3 

Bold Ts represent amino-modified bases 

MDN1 probes (Cy3) 

MDN1-794 TTT GTC GTG GAT AGT GTG GAC CTT AGG GAC GAT AAC GCC ACA GAT TGA CG 

MDN1-860 CTC CCG AGT TGA CGA AGA GAG GAA ACC GTT TTA TGA GTA GGG ACA AAG GTT 

MDN1-1104 CTA TAA GTA CCC ATC TCC CTT CTT TGA CCG CGG TAG CGA GAA CAC CAG CTC 

MDN1-1210 TTT GCA GCC TTT ACA GTC TCT CCT CTG GAT GGA ATG GTT AGT TCG CGC TT 

CCW12 probes (Cy3.5) 

CCW12-59 GGT GAC CAA AGT GGT AGA TTC TTG GCT GAC AGT AGC AGT GGT AAC GTT AG 

CCW12-140 GTC ATC GAC GGT GAC GGT AGC GGT GGA AAC CAA AGC TGG GGA GAC AGT TT 

CCW12-191 CTT TGG GGC TTC AGT GGT CAA TGG GCA CCA GGT GGT GTA TTG AGT GAT AA 

CCW12-245 GGT GTT CTT TGG AGC TTC AGT AGA GGT AAC TGG AGC AGC AGT AGA AGT AC 

rRNA-ITS2 (Cy5) 

ITS2-1 ATA GGC CAG CAA TTT CAA GTT AAC TCC AAA GAG TAT CAC TC 
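
When designing or re-typing probes like those listed above, it can help to confirm length and GC content against the "50 nt/50% GC" guideline given in the Hybridization section. The following is a minimal sketch, not part of the published protocol; the function names are hypothetical, the two sequences are copied from the list above, and amino-modified Ts are simply counted as T.

```python
# Illustrative check of probe length and GC content.
# Function names are hypothetical; sequences are copied from the probe list.

def clean(seq: str) -> str:
    """Remove the spaces used for readability and upper-case the sequence."""
    return "".join(seq.split()).upper()

def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in the sequence."""
    s = clean(seq)
    return (s.count("G") + s.count("C")) / len(s)

probes = {
    "MDN1-794": "TTT GTC GTG GAT AGT GTG GAC CTT AGG GAC GAT AAC GCC ACA GAT TGA CG",
    "ITS2-1": "ATA GGC CAG CAA TTT CAA GTT AAC TCC AAA GAG TAT CAC TC",
}

for name, seq in probes.items():
    s = clean(seq)
    print(f"{name}: {len(s)} nt, GC = {gc_fraction(seq):.0%}")
```

Probes that deviate strongly from the guideline may need a different formamide concentration, as discussed for the poly-dT probe.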



5.3. Protocol 

1. Remove the ethanol from the 12-well plate using a vacuum pump and 
rehydrate samples by adding 2 ml 2× SSC at RT for 5 min. Do this 
twice. 

2. Wash cells once with 40% formamide/2× SSC at RT for 5 min. 

During washes, prepare the hybridization mix: 

3. Mix 0.5 ng of each probe per coverslip with 10 µg of E. coli tRNA and 
10 µg of ssDNA (2 ng of probe mix when using four probes against one 
gene). 

4. Lyophilize using a SpeedVac. 

5. Add 12 µl of solution F to the probe tube, heat at 95 °C for 3 min. 

6. Add 12 µl of solution H to the hybridization mix. 

7. Put a drop of 22 µl of hybridization mix onto the Parafilm stretched out 
on a glass plate. Avoid bubbles in the hybridization mix. (Use the back 
of a forceps to scratch the edges of the Parafilm so that the Parafilm 
sticks to the glass plate.) 

8. Using forceps, place the coverslip with cells facing down on the 
hybridization mix. No bubbles should form. Multiple coverslips can 
be placed next to each other onto a single glass plate, but leave about 
1.5 cm space between coverslips. 

9. To seal the "hybridization chamber," place two cardboard spacers 
(2-3 mm thick and 5 × 0.5 cm in size) on opposite sides of the 
glass plate over the Parafilm, plus a 0.5 × 0.5 cm piece at the centre 
of the plate. Cover the glass plate with a second layer of Parafilm, 
without touching the coverslips. Seal the two layers of Parafilm using 
the back of the forceps to avoid evaporation. Cover with aluminum 
foil. 

10. Incubate at 37 °C overnight in the dark. 

11. Preheat 40% formamide/2× SSC to 37 °C and put 2 ml into each well 
of a 12-well tissue culture dish. 

12. Place coverslips back in the 12-well tissue culture dish containing 40% 
formamide/2× SSC, cells facing up; incubate 15 min at 37 °C 
(incubator). 

13. Wash once more with 40% formamide/2× SSC at 37 °C (2 ml, 15 min). 

14. Wash once with 2× SSC/0.1% Triton X-100 at RT (2 ml, 15 min). 

15. Wash once with 1× SSC at RT (2 ml, 15 min). 

16. Wash coverslips in 1× PBS plus DAPI (2 ml, 2 min). 

17. Wash once with 1× PBS (2 ml, 2 min). 

18. Before mounting, dip coverslips in 100% EtOH and let them dry. 

19. Invert coverslips, cells facing down, onto a drop of mounting solution 
placed on a glass slide. Allow the mounting solution to polymerize 
overnight at room temperature in the dark. 

20. Seal coverslips with nail polish. Let nail polish dry before imaging, 
otherwise the objective may be damaged. 

21. Go to the microscope and enjoy your images. 

Slides can be stored at 4 °C for a few days and at −20 °C for months in 
the dark. 
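
The per-coverslip amounts in steps 3 to 6 can be tabulated for a given experiment. The sketch below is only a convenience calculation under one assumption that the protocol does not state: that the quantities scale linearly when several coverslips are prepared from one tube. The function name and dictionary keys are hypothetical.

```python
# Hypothetical helper: amounts for the hybridization mix (steps 3-6).
# Assumption: per-coverslip values from the protocol scale linearly
# with the number of coverslips.

def hybridization_mix(n_probes: int, n_coverslips: int = 1) -> dict:
    """Return reagent amounts for n_probes probes and n_coverslips coverslips."""
    return {
        "each_probe_ng": 0.5 * n_coverslips,             # 0.5 ng of each probe
        "total_probe_ng": 0.5 * n_probes * n_coverslips,
        "tRNA_ug": 10 * n_coverslips,                    # E. coli tRNA
        "ssDNA_ug": 10 * n_coverslips,                   # salmon testes ssDNA
        "solution_F_ul": 12 * n_coverslips,              # added after lyophilization
        "solution_H_ul": 12 * n_coverslips,
    }

mix = hybridization_mix(n_probes=4)
for reagent, amount in mix.items():
    print(f"{reagent}: {amount}")
```

For four probes against one gene and a single coverslip this gives 2 ng of total probe, matching the value quoted in step 3.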




6. Image Acquisition 

The need for sensitive imaging equipment was likely one reason why 
single molecule detection was not approachable in the past. However, since 
sensitive CCD cameras have become a standard component of most micro- 
scopes and dyes are very bright and photostable, signal intensities are not a 
limiting factor for detection of single mRNAs by FISH. Most epifluores- 
cence microscope setups in imaging facilities are sensitive enough to detect 
single mRNAs. We use a standard epifluorescent microscope and CCD 
camera (described below). 

When simultaneously imaging mRNAs expressed from multiple genes 
using probes labeled with different fluorophores, it is crucial to use the 
correct filter sets to avoid bleedthrough between the different channels. For 
example, when using Cy3 and Cy3.5, whose absorbance and emission maxima are 
relatively close to each other (550/570 nm and 581/596 nm), narrow band-pass 
filter sets have to be used. Appropriate filter sets are listed below. 

To obtain expression profiles and mRNA distributions, images have to 
be acquired in 3D. Using a 100× objective, collect z-slices every 200 nm. 
Using the setup presented below, exposure times of 1 s per z-stack should 
lead to sufficient signal. If single mRNAs cannot be detected, it is likely that 
the hybridization did not work or the microscope is not aligned properly. 

6.1. Microscope (example) 

• Olympus BX61 epifluorescence microscope (Olympus, Center Valley, 
PA) 

• Olympus UPlanApo 100×, 1.35 NA oil-immersion objective 

• Olympus U-DICTHC Nomarski prism for DIC 

• Chroma Filters 31000 (DAPI), 41001 (FITC), SP-102v1 (Cy3), 
SP-103v1 (Cy3.5), and CP-104 (Cy5) (Chroma Technology, Rockingham, 
VT) 

• Light source X-Cite 120 PC (EXFO, Mississauga, ON) 

• CoolSNAP HQ camera (Photometrics, Tucson, AZ) 




7. Image Analysis 

Hybridizing four to five FISH probes, each labeled with five fluorescent 
dyes, to an mRNA creates a strong fluorescent signal. Although barely 
visible by eye, single mRNAs are easily detected using a standard CCD 
camera. Single mRNA signals appear as diffraction-limited spots within the 
cell. Sites of transcription often show higher signal intensities and are easily 
distinguishable as they colocalize with the DAPI signal (Figs. 26.1 and 26.3). 
MDN1 transcription sites are visible by eye, and being able to see one is a 
good first indicator of a successful FISH experiment. 

To simplify the data analysis, it is often helpful to reduce the 3D dataset 
to a 2D image using a maximum projection. The maximum projection 
displays the maximum value of all images in the z-stack for particular pixel 
locations and creates a 2D image. As mRNAs for most genes are expressed 
at low numbers, the probability that two mRNAs are found in the same x—y 
but a different z position is low, allowing a reduction to 2D to accurately 
represent the 3D dataset. 
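
The maximum projection described above can be sketched in a few lines. This is an illustrative pure-Python version that assumes the z-stack is held as a list of equally sized 2D slices (lists of rows); the function name and the toy pixel values are hypothetical, and real image data would normally be handled with an array library.

```python
# Illustrative maximum-intensity projection of a z-stack.
# Assumption: the stack is a list of z-slices, each a list of rows of
# pixel intensities, and all slices have the same dimensions.

def max_projection(stack):
    """Return a 2D image holding, per pixel, the maximum over all z-slices."""
    n_rows = len(stack[0])
    n_cols = len(stack[0][0])
    return [
        [max(z_slice[r][c] for z_slice in stack) for c in range(n_cols)]
        for r in range(n_rows)
    ]

# Toy stack: three z-slices of 2 x 2 pixels (made-up values).
stack = [
    [[0, 5], [1, 0]],
    [[7, 2], [0, 0]],
    [[1, 1], [0, 9]],
]
projection = max_projection(stack)
print(projection)
```

Each output pixel keeps only the brightest value found at that x-y position, which is why two mRNAs stacked at the same x-y position but different z would merge into one spot.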

To test for specificity of the signal, probes can be hybridized to control 
cells not expressing the transcript of interest, for example, a deletion strain. 
Alternatively, a gene can be put under an inducible promoter, such as a GAL 
promoter, and transcription turned off long enough that all mRNAs are 
degraded. With well-labeled probes and high hybridization efficiency, the 
difference in signal between expressing and nonexpressing cells is generally 
obvious. Spots are observed throughout the cell when a gene is expressed, 
and no signal should be observed in cells where the gene is not expressed. 

Figure 26.3 Quantifying mRNA signal intensities. Intensity of a single mRNA can be 
calculated by determining the fluorescence intensity emitted from a single probe. 
(A) FISH probes hybridizing to the 5' end of MDN1 mRNA were used for the 
hybridization shown in (B). A small number of single probes tend to hybridize 
unspecifically to the cells and can be visualized by changing the contrast levels 
(B, compare left and middle panel). The intensity is determined using a spot 
detection program. (C) Signal intensity of each spot corresponding to a single DNA 
probe is shown in black, signal intensities of single mRNA and sites of transcription 
are shown in red. Consistent with the four probes used in the hybridization (A), 
intensity of single mRNA signals in the cytoplasm is four times the intensity of a 
single probe (D, E). Nascent mRNAs at the site of transcription are two and three 
times the intensity of a single mRNA in the cytoplasm (Zenklusen et al., 2008). 

However, single molecule resolution FISH is not completely devoid of 
background (Fig. 26.3). When analyzing the images carefully, low-intensity 
signals are found in the negative control. The weak signals originate from 
single FISH probes sticking nonspecifically to the cell. Despite sequence 
specificity and stringent hybridization and washing conditions, a low number 
of single probes will usually stick to the cell. Their signal intensity is low, 
and they appear as weaker diffraction-limited spots compared to the signal 
emitted from an mRNA. These signals can be distinguished from an 
mRNA signal. In most cases the difference is obvious: mRNA signals are 
bright and single-probe signals are dim. However, sometimes this difference 
is not so evident. Distinguishing between background sticking and real 
mRNA signal becomes an issue particularly if the hybridization efficiency of 
the probes is low. In this case, some mRNAs will only have one, two, or 
three out of four possible probes bound, resulting in signals of variable 
intensity for different mRNAs within a single cell. When two or fewer 
probes are bound, the signal becomes more difficult to separate from a 
single probe nonspecifically bound to the cell. 

There are two ways to determine the signal intensity of a single probe 
and to distinguish it from mRNA signals. The first uses a rough 
approximation of the signal intensities of single probes. Just as they stick 
nonspecifically to cells, a small number of probes will also stick to the glass 
surface outside of the cells. When using well-labeled probes, their signal 
intensity is homogeneous and they are easily distinguishable from other 
"junk" on the glass. Use image acquisition software to determine the 
brightest pixel of each spot. Signal intensities as low as those of spots on 
the glass slide indicate background, while higher intensity signals originate 
from mRNAs. However, it is important to note that with this method, the 
autofluorescence from the cell, although usually low, is added to the signal 
emitted from a single probe within a cell but not to the one from the glass. 
Therefore, using the intensity of a single probe from the glass background 
will underestimate the signals expected from mRNAs inside the cell. This 
method is simple, although only approximate in distinguishing background 
spots from real signals. 

A better and more quantitative approach is to determine the exact signal 
intensity emitted from each mRNA. Different spot detection and 
quantification algorithms exist, and one of the most established methods 
determines the signal intensity emitted from a diffraction-limited spot by 
fitting a 2D Gaussian mask over each spot (Thompson et al., 2002). We have 
developed custom software to apply this algorithm, which also takes into 
account a background correction and can be found at 
http://www.singerlab.org (Zenklusen et al., 2008). Shown in Fig. 26.3 are 
two cells hybridized with four probes to the 5' end of the MDN1 mRNA. The 
spot detection program identifies 18 spots. Spots containing a single or four 
probes can easily be distinguished from each other. Single-probe intensities 
are around 230 a.u. and mRNA signals show a mean intensity of 996 a.u., 
about four times the intensity of a single probe. This illustrates how signals 
of nonspecific probe binding can be distinguished from signals of probes 
hybridized to mRNA molecules. Determining the intensity of single probes 
also establishes the signal intensity that is expected from an efficient 
hybridization and thereby makes it possible to determine hybridization 
efficiency. 
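
Once spot intensities have been measured, the comparison with the single-probe intensity can be written down explicitly. The sketch below is not the authors' software; it only illustrates the arithmetic using the values quoted above (about 230 a.u. per probe, four probes per mRNA). The function names and the three-way classification thresholds are assumptions made for illustration.

```python
# Illustrative classification of detected spots by intensity.
# Assumptions: single-probe intensity of ~230 a.u. (value quoted in the
# text); the category boundaries below are hypothetical.

SINGLE_PROBE_AU = 230.0

def probes_bound(spot_intensity: float, single_probe: float = SINGLE_PROBE_AU) -> int:
    """Estimate how many probes contribute to a spot."""
    return round(spot_intensity / single_probe)

def classify_spot(spot_intensity: float, single_probe: float = SINGLE_PROBE_AU) -> str:
    """Label a spot as background, ambiguous, or mRNA."""
    n = probes_bound(spot_intensity, single_probe)
    if n <= 1:
        return "background"   # consistent with one nonspecifically bound probe
    if n == 2:
        return "ambiguous"    # hard to separate from background
    return "mRNA"             # three or four probes bound

for intensity in (230.0, 460.0, 996.0):
    print(intensity, probes_bound(intensity), classify_spot(intensity))
```

A dataset in which many spots fall into the ambiguous class suggests low hybridization efficiency.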

Figure 26.3 furthermore illustrates why achieving high hybridization 
efficiency is crucial. Low hybridization efficiency will lead to datasets that 
are difficult to analyze, as a clear distinction between signal and background 
is not possible. When signal intensity of individual mRNAs is highly 
heterogeneous, it is best to repeat the hybridization to obtain more uniform 
signals. For some probe sets, efficient hybridization cannot be achieved and 
new probes against different regions of the gene will have to be synthesized. 

The ability to determine the intensity of a single mRNA also allows 
calculation of the number of nascent mRNAs at the site of transcription. 
Dividing the signal intensity of each of the two spots colocalizing with the 
DAPI signal in Fig. 26.3 by the intensity of a single mRNA shows that two 
and three nascent mRNAs, respectively, are present on the MDN1 genes. 
The number of nascent transcripts is a measure of polymerase loading and 
therefore the most direct assessment of transcriptional activity on a single 
gene. Importantly, to determine polymerase density on a gene, probes 
hybridizing to the 5' end of the mRNAs have to be used. 
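
The nascent transcript count is a simple ratio, sketched below. The cytoplasmic spot intensities and the two transcription-site intensities are made-up example values, chosen so that the mean single-mRNA intensity equals the 996 a.u. quoted earlier; the function name is hypothetical.

```python
# Illustrative estimate of nascent mRNAs at a transcription site.
# All intensity values below are invented for the example.
from statistics import mean

def nascent_transcripts(site_intensity: float, single_mrna_intensity: float) -> int:
    """Transcription-site intensity divided by the single-mRNA intensity."""
    return round(site_intensity / single_mrna_intensity)

# Intensities (a.u.) of cytoplasmic spots classified as single mRNAs.
cytoplasmic_spots = [950, 1010, 1028]
single_mrna = mean(cytoplasmic_spots)

# Intensities (a.u.) of two spots colocalizing with the DAPI signal.
transcription_sites = [2000, 3000]
counts = [nascent_transcripts(s, single_mrna) for s in transcription_sites]
print(single_mrna, counts)
```

Using the mean of many cytoplasmic spots as the reference reduces the influence of spot-to-spot variation on the estimate.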

Quantification of signals from highly expressed genes is more difficult. 
As shown in Fig. 26.1, CCW12 is highly expressed and individual mRNAs 
overlap each other so that it is not possible to determine the intensity of 
every single mRNA. Therefore, this technique is better suited to study 
genes expressed at low levels. 




8. Summary and Perspectives 

Single molecule resolution FISH is a powerful tool to study gene 
expression. We have applied it to count single mRNAs and determine 
transcription kinetics, investigate splicing regulation, and study mRNA 
localization. However, its potential applications are even broader. Because 
it can detect and count every individual mRNA molecule in a cell, single 
molecule resolution FISH will be useful for studying many aspects of gene 
expression regulation. Even if expressed at only one molecule per cell, 
mRNAs can be detected and their precise location within the cell can be 
determined. Transcription networks, as well as more classical gene 
expression processes such as mRNA export and degradation, can be 
analyzed in greater detail using single molecule methodologies. The 
ability to detect single molecules will expand our understanding of these 
cellular processes. 



Abstract 

Approximately one-third of all newly translated polypeptides interact with the 
endoplasmic reticulum (ER), an event that is essential to target these nascent 
proteins to distinct compartments within the cell or to the extracellular milieu. 
Thus, the ER houses molecular chaperones that augment the folding of this 
diverse group of macromolecules. The ER also houses the enzymes that catalyze 
a multitude of posttranslational modifications. If, however, proteins misfold or are 
improperly modified in the ER they are proteolyzed via a process known as 
ER-associated degradation (ERAD). During ERAD, substrates are selected by 
molecular chaperones and chaperone-like proteins. They are then delivered to the 
cytoplasmic proteasome and hydrolyzed. In most cases, delivery and 
proteasome-targeting require the covalent attachment of ubiquitin. The discovery 
of the ERAD pathway and the elucidation of its underlying mechanisms have been 
aided by the development of in vitro assays that employ components derived from 
the yeast, Saccharomyces cerevisiae. These assays recapitulate the selection of 
ERAD substrates, the "retrotranslocation" of selected polypeptides from the ER 
into the cytoplasm, and the proteasome-mediated degradation of the substrate. The 
ubiquitination of integral membrane ERAD substrates has also been reconstituted. 




1. Introduction 

Cells are continuously faced with various forms of stress, including 
altered temperature, limited nutrient availability, changes in osmotic pres- 
sure, and the presence of toxic agents. To surmount such challenges, 
adaptive pathways are triggered that induce the synthesis of proteins that 
lessen the effects of cell stress. In model eukaryotes, such as the yeast 
Saccharomyces cerevisiae, many of the stress-induced adaptive pathways have 
been defined, thanks to the multitude of available genetic, genomic, and 
biochemical tools. 

Stress-responsive pathways can also be triggered from within intracellu- 
lar compartments. One compartment in which this has been examined in 
detail is the endoplasmic reticulum (ER), which is the first organelle 
encountered by newly synthesized secreted proteins. In S. cerevisiae, nascent 
secreted proteins can translocate into the ER either during or soon after 
translation (Cross et al., 2009; Rapoport et al., 1999). Translocation is 
facilitated by a multiprotein complex that resides at the ER membrane. 
The key component of this complex is an aqueous translocation pore, and 
the pore and its associated partners have been termed the "translocon" 
(Schnell and Hebert, 2003). 

Concomitant with translocation, most secreted proteins are posttransla- 
tionally modified and begin to fold, which explains why the ER is stocked 
with molecular chaperones and enzymes that catalyze protein folding 
(Vembar and Brodsky, 2008). Because the acquisition of the native or 
near-native state is a prerequisite for the subsequent delivery of secreted 
proteins to their ultimate destinations, the ER also contains a rich variety of 
factors that monitor protein folding. Many of these "quality control" factors 
are molecular chaperones. In the event that proteins fail to fold, either as a 
result of the stresses noted above or due to genetic or stochastic errors, an 
ER stress response, known as the unfolded protein response (UPR), is 
initiated. One outcome of the UPR is an increase in the machinery that 
destroys aberrant secreted proteins, thus clearing the ER of potentially toxic 
protein conformers (Jonikas et al., 2009; Travers et al., 2000). 

For many years, the existence of a quality control protease was sought 
(Vembar and Brodsky, 2008). Early data from mammalian cell systems 
suggested that the protease resided within the ER, and a candidate protease 
was eventually purified (Otsu et al., 1995). However, parallel studies 
suggested the existence of an alternate system to degrade aberrant proteins 
that had entered the ER. In short, evidence emerged that misfolded ER 
proteins might employ the services performed by the cytoplasmic 
ubiquitin-proteasome system (UPS). The proteasome is a large (26S), 
multicatalytic enzyme that binds and unfolds proteins and then processively 
degrades substrates to short peptides (Hanna and Finley, 2007; Pickart and 
Cohen, 2004). Nearly all proteasome-targeted substrates are modified with 
poly-ubiquitin, which facilitates proteasome-capture. Early evidence 
indicated that the UPS proteolyzes misfolded, integral membrane proteins in 
the ER of yeast (Hampton et al., 1996; Sommer and Jentsch, 1993) and 
mammalian (Jensen et al., 1995; Ward et al., 1995) cells. These substrates 
were multispanning membrane proteins, which by definition contained 
cytoplasmic polypeptide loops; therefore, it made sense that the cytoplas- 
mically localized UPS might recognize and destroy misfolded membrane 
proteins. These data were also consistent with the established function of 
the UPS in mediating cytoplasmic protein quality control (Sherman and 
Goldberg, 2001). What remained unknown was how soluble misfolded 
proteins within the ER lumen were destroyed. 

To answer this question, we developed an in vitro system that monitored 
the fates of an ER-localized wild-type secreted protein and a secreted protein 
that was unable to acquire N-linked oligosaccharides (McCracken and 
Brodsky, 1996; Werner et al., 1996). Our establishment of this assay built 
upon the pioneering in vitro yeast systems developed in the Walter, Blobel, 
Meyer, and Schekman labs that had been co-opted to follow the translocation 
of nascent secreted proteins into the yeast ER (Deshaies and Schekman, 
1989; Hansen et al., 1986; Rothblatt and Meyer, 1986; Waters and Blobel, 
1986). In our system, the wild-type substrate was the alpha mating type 
prepheromone, pre-pro-alpha factor (ppaF). Upon translocating into the 
ER, ppaF is processed by the signal sequence peptidase, generating pro-alpha 
factor (paF). PaF is then triply glycosylated, which generates GpaF (Julius 
et al., 1984). This substrate is competent for ER-exit through the action of 
Golgi-targeted COPII vesicles (Belden and Barlowe, 2001). In contrast, the 
mutated substrate cannot be N-glycosylated so that paF is the terminal species 
that forms within the ER (Fig. 27.1). These substrates were chosen for the 
following reasons. First, wild-type and mutant forms of ppaF 
posttranslationally translocate into the ER in vitro (Hansen et al., 1986; 
Rothblatt and Meyer, 1986; Waters and Blobel, 1986). This feature allowed 
for the large-scale synthesis and isolation of radiolabeled substrate, which 
could then be aliquotted and stored prior to use. Second, paF appeared to be 
degraded within the yeast secretory pathway (Caplan et al., 1991), indicating 
the existence of a secretory protein quality control system for soluble proteins. 
Finally, ppaF-derivatives were also degraded in mammalian cell lines (Su 
et al., 1993). Thus, a dissection of the paF biogenic pathway might lead to the 
elucidation of a conserved quality control machinery. 

Figure 27.1 The early biogenic pathways utilized by wild-type ppaF and ΔGppaF, a 
soluble ERAD substrate. Upon translocation into the ER, the signal sequence in ppaF is 
liberated and the protein becomes triply glycosylated. The resulting species, 3GpaF, is 
stable in yeast ER-derived microsomes. In contrast, the three sites required for the 
addition of N-linked glycans in ΔGppaF have been mutated. Thus, ΔGppaF is 
converted into paF, which is an ERAD substrate. 



In 1996, we reported that paF could be selectively exported, or 
retrotranslocated, from ER-derived vesicles back to the cytoplasm (McCracken 
and Brodsky, 1996). Once in the cytoplasm, paF was destroyed by the 
proteasome (Werner et al., 1996). However, GpaF, which derived from the 
wild-type precursor (Fig. 27.1), was stable. Based on our results, we named 
this process ER-associated degradation (ERAD) (McCracken and Brodsky, 
1996). In parallel to our efforts, Wolf and colleagues established that another 
mutated secreted protein, CPY*, was also degraded by the proteasome in 
yeast (Hiller et al., 1996). Collectively, these data indicated that integral 
membrane and soluble proteins in the ER were both handled by the UPS 
(Vembar and Brodsky, 2008). In fact, a subsequent modification of our 
assay established that the proteasome was necessary and sufficient to 
retrotranslocate and degrade paF (Lee et al., 2004). Moreover, the use of 
ER-derived vesicles, or "microsomes," from mutant strains allowed our 
laboratory and others to identify the ER lumenal chaperones required 
for paF degradation (Brodsky et al., 1999; Gillece et al., 1999; Kabani 
et al., 2003; Lee et al., 2004; McCracken and Brodsky, 1996; Nishikawa 
et al., 2001). The in vitro system was also employed by Römisch and 
Schekman to provide evidence suggesting that the translocon might serve 
as the conduit for retrotranslocation (Pilon et al., 1997). Each of these in vitro 
assays is described in detail in Section 2. 

Do integral membrane proteins also retrotranslocate and become 
solubilized from the ER membrane prior to proteasome-mediated 
destruction? Data obtained from mammalian cell systems suggested that this 
might be the case. First, Kopito and colleagues reported that cytoplasmic 
"aggresomes" accumulated in mammalian cells expressing high levels of a 
misfolded membrane protein that were simultaneously challenged with 
proteasome inhibitors (Johnston et al., 1998). The aggresomes contained 
the misfolded substrate and components of the UPS (Johnston et al., 1998; 
Wigley et al., 1999). Second, Ploegh and coworkers reported that a human 
cytomegalovirus gene product catalyzed the "dislocation" of the major 
histocompatibility class I molecule into cytoplasmic fractions in transfected 
cells (Wiertz et al., 1996). Because these phenomena were only evident in 
mammalian cells, further attempts to elucidate the mechanism of membrane 
protein retrotranslocation have had to rely on pharmacological and 
RNAi-related technologies. 

To better define the pathway by which integral membrane proteins are 
selected and retrotranslocated for ERAD, we developed an in vitro assay in 
which each step in the degradation pathway could be dissected (Nakatsukasa 
et al., 2008). The components for this assay, ER-derived microsomes and 
concentrated cytosol, were again isolated from S. cerevisiae. The use of this 
assay led to the following discoveries. First, we observed that Hsp70 and 
Hsp40 molecular chaperones help link ERAD substrates to E3 ubiquitin 
ligases. Second, we found that the E3s required for ERAD exhibit func- 
tional redundancy. Third, we were able to evoke the ATP- and cytosol- 
dependent retrotranslocation of a polytopic integral membrane protein into 
the cytosolic fraction. Fourth, we determined that membrane protein 
extraction required the Cdc48p complex, which was previously found to 
play an important role in ERAD (Jentsch and Rumpf, 2007). Fifth, we 
established that a cytoplasmic polyubiquitin extension enzyme, or "E4," 
elongated the polyubiquitin chain on the ERAD substrate and was required 
for maximal rates of substrate degradation. And sixth, we confirmed that the 
solubilized substrate was competent for proteasome-dependent degrada- 
tion. These discoveries were made possible through the use of ER-derived 
microsomes and cytosol that were prepared from yeast containing specific 
loss-of-function and thermosensitive mutant alleles. The assays that led to 
these discoveries are described in Section 3. 




2. In Vitro ERAD Assays Using a Soluble 
Substrate, PaF 

In this section, the isolation of the materials required to monitor 
the degradation of paF is described first. Next, the assays for paF 
retrotranslocation and ERAD are detailed. Throughout the section, comments 
are added to note how reagents prepared from mutant strains have been used 
to better define the ERAD pathway. 

2.1. Materials 

2.1.1. Microsome preparation 

The preparation of ER-derived microsomes from S. cerevisiae has been well 
documented (Deshaies and Schekman, 1989; Rothblatt and Meyer, 1986) but 
is described in outline form, with minor revisions, below. When 
temperature-sensitive mutants are employed, the cultures are grown at a 
permissive temperature and are then shifted to the restrictive temperature 
(i.e., 37 °C). Although thermosensitive phenotypes can be recapitulated in 
vitro, the growth period at the nonpermissive temperature needed to obtain 
the mutant defect in the assay must be determined empirically and can range 
from < 20 min to 5 h (Becker et al., 1996; Brodsky and Schekman, 1993; 
Brodsky et al., 1993; Latterich et al., 1995; Nakatsukasa et al., 2008). 
Smaller scale isolations of microsomes have also been used in ERAD studies 
(Nakatsukasa et al., 2008), but the translocation efficiency is somewhat 
lower (our unpublished data). 

1. Yeast are grown at the desired temperature in rich medium and with 
vigorous shaking until the culture reaches mid-log to late-log phase 
(optical density at 600 nm [OD 600 ] of 2.0—3.0). We typically grow 
1—2 1 of yeast for a microsome preparation. 

2. The cell walls are digested with a j6-l,3-glucanase hydrolyzing enzyme 
that either can be purified from recombinant Escherichia coli that express 
the enzyme (Shen et al, 1991) or purchased commercially (e.g., Zymo- 
lyase, from MP BioMedicals). 

3. The resulting spheroplasts are collected by centrifugation through 
20 mM HEPES, pH 7.4, 0.8 M sucrose, 1.5% Ficoll 400 at 
6000 rpm, in an HB-6 swinging bucket rotor. It is critical that each 
of the following steps is performed at 4 °C. 

4. The spheroplasts are resuspended to a final OD 600 of 100/ml in 20 mM 
HEPES, pH 7.4, 0.1 M sorbitol, 50 mMKOAc, 2 mMEDTA, and a 
protease inhibitor cocktail, and the solution is transferred to a tight- 
fitting, Teflon-glass homogenizer that can be driven with a motor. 

5. The plasma membrane is then broken by 10 strokes with the motor 
running at the highest setting. 

6. The broken cellular material is layered onto a cushion that contains
20 mM HEPES, pH 7.4, 1.0 M sucrose, 50 mM KOAc, 1 mM DTT.

7. After centrifugation at 6500 rpm in an HB-6 swinging bucket rotor,
the crude microsomal fraction is resuspended in an equal volume of
B88 (20 mM HEPES, pH 6.8, 250 mM sorbitol, 150 mM KOAc,
5 mM MgOAc).

8. The microsomes are collected by centrifugation at ~15,000 × g for
10 min, resuspended in B88, and recentrifuged.

9. After final resuspension in a small volume of B88, the microsome
concentration is adjusted such that the OD280 should equal ~40 when a small
aliquot of a 1:10 dilution of the resuspended material is assessed in 2% SDS.

10. Microsome aliquots (~50 μl) are frozen and stored at -80 °C.
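
The protocol above gives some spins as rotor speeds (6000 and 6500 rpm in an HB-6 rotor) and others as relative centrifugal force (~15,000 × g). When a different rotor must be substituted, the standard relation RCF = 1.118 × 10^-5 × r × rpm^2 (r in cm) can be used to convert between the two. The sketch below is only an illustration: the radius value is a placeholder assumption, not a specification of the HB-6 rotor, and should be replaced with the figure from the rotor manual.

```python
def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return 1.118e-5 * radius_cm * rpm ** 2


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given RCF at a given radius."""
    return (rcf / (1.118e-5 * radius_cm)) ** 0.5


# Placeholder radius (assumption): look up the real value for your rotor.
ASSUMED_RADIUS_CM = 14.0

for speed in (6000, 6500):
    force = rcf_from_rpm(speed, ASSUMED_RADIUS_CM)
    print(f"{speed} rpm at r = {ASSUMED_RADIUS_CM} cm is about {force:.0f} x g")
```

Running the conversion in both directions is a quick way to confirm that a substitute rotor reaches the same force as the one named in the protocol.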
2.1.2. Isolation of ΔGppαF, the precursor of a soluble ERAD
substrate, and ppαF, the wild-type control

The substrates required for the in vitro ERAD assay are the wild-type ppαF
control (which is encoded by the MFα1 locus; Kurjan and Herskowitz,
1982) and a form of ppαF that contains site-directed mutations in the three
sites required for the addition of N-linked oligosaccharides (Caplan et al.,
1991). We denote this mutant as ΔGppαF. As described above and in
Fig. 27.1, the signal sequence of ppαF is removed in the ER and the
resulting species becomes triply glycosylated, ultimately forming 3GpαF in
the microsomes. In contrast, ΔGppαF is converted to pαF, which is an
ERAD substrate in the yeast microsome-based system (McCracken and
Brodsky, 1996; Werner et al., 1996). pαF has also been prepared with a
fluorescent tag and is retrotranslocation competent after its entrapment in
dog pancreas microsomes (Wahlman et al., 2007).

Using SP6 polymerase, wild-type ppαF is transcribed from plasmid pDJ100
and ΔGppαF is transcribed from plasmid pGEM2alpha36. We next isolate the
messages encoding ppαF and ΔGppαF and perform an in vitro translation
reaction in the presence of 35S-methionine and concentrated, gel-purified
yeast lysate to obtain radiolabeled ppαF and ΔGppαF (Lee et al., 2004;
McCracken and Brodsky, 1996; Werner et al., 1996). The logic underlying
this protocol is that maximal ppαF translocation efficiency requires factors in
the yeast lysate (presumably binding stably to ppαF) that are absent in other
translation-competent lysates (Chirico et al., 1988; Deshaies et al., 1988).

More recently, we have employed the Promega TnT SP6 Coupled
Reticulocyte Lysate System to synthesize radiolabeled ppαF, ΔGppαF,
and other substrates (Hrizo et al., 2007), and have discovered that the
resulting products translocate efficiently into yeast microsomes. In brief,
each plasmid template is mixed on ice with the commercial buffer,
ribonuclease inhibitor, SP6 polymerase, an amino acid mixture (lacking
methionine), and the supplied rabbit reticulocyte lysate. The reaction is then
supplemented with 20 μCi of 35S-labeled amino acids (PerkinElmer
EXPRE35S35S Protein Labeling Mix). A 50-μl (total volume) reaction is
typically performed according to the manufacturer's instructions at 30 °C
for 90 min. Single-use aliquots are then flash frozen and stored at -80 °C.



2.1.3. Yeast cytosol 

The preparation of concentrated yeast cytosol using liquid nitrogen was first 
described by Sorger and Pelham (1987) and modified for the ERAD assay as 
described (McCracken and Brodsky, 1996). Strains containing
temperature-sensitive mutations can be used to prepare cytosol, but again the
conditions required to recapitulate temperature-sensitive defects in vitro
must be determined empirically.
1. Yeast cells are grown with vigorous shaking in rich medium to log phase
(OD600 = ~2.0) at 30 °C or at the desired alternate temperature(s). In
our experience, at least 6 l of culture are needed for efficient lysis.

2. The cells are collected by centrifugation, resuspended in water,
recentrifuged, and resuspended in a minimal amount of B88. Typically, we
use <5 ml of B88 for the number of cells obtained from a 6-l yeast
culture.

3. The cells are added slowly to 500 ml of liquid nitrogen in a tripour plastic
beaker. After the yeast are frozen, the liquid nitrogen is decanted and the
cells are stored at -80 °C.

4. Approximately 500 ml of liquid nitrogen is added to a stainless-steel 
blender, followed by the frozen yeast. The blender is initially turned on
at the lowest setting, but the blade speed is soon increased to the highest
setting. The volume must be maintained above the rotating blades by the 
periodic addition of liquid nitrogen. To prevent spills, the blender must 
be covered as often as possible with a lid that can withstand low 
temperatures (e.g., a thick Styrofoam slab). 

5. After 10 min, the blender is turned off, the liquid nitrogen is evaporated,
and the resulting powder is transferred to a 50-ml plastic tube, which can
be stored at -80 °C.

6. A minimal amount of B88 (e.g., ~0.5 ml/40 ml of broken yeast) is 
added as the lysate begins to thaw at room temperature or on ice, and 
freshly prepared DTT is added to a final concentration of 1 mM. 

7. The lysate is centrifuged at 10,000 × g for 10 min at 4 °C, and the
supernatant is collected and recentrifuged.

8. The second supernatant is centrifuged at 300,000 × g for 1 h at 4 °C, and
the resulting supernatant is aliquoted, frozen in liquid nitrogen, and stored
at -80 °C. A Bio-Rad protein assay is used to determine the protein
concentration. In our experience, cytosols at >20 mg/ml work best in ERAD
assays as long as they are not thawed and refrozen.
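
Because the cytosol stock is concentrated (>20 mg/ml) while the assays described in this chapter call for final concentrations of a few mg/ml, the volume of stock to add follows from a simple dilution. The following is a minimal sketch of that arithmetic; the stock concentration in the example is a made-up value for illustration only.

```python
def cytosol_volume_ul(stock_mg_per_ml: float,
                      final_mg_per_ml: float,
                      reaction_ul: float) -> float:
    """Volume of cytosol stock (in microliters) that gives the requested
    final protein concentration in a reaction of the given total volume."""
    if stock_mg_per_ml <= final_mg_per_ml:
        raise ValueError("stock must be more concentrated than the final target")
    return final_mg_per_ml * reaction_ul / stock_mg_per_ml


# Hypothetical stock at 25 mg/ml, 2 mg/ml final, 60-microliter reaction.
volume = cytosol_volume_ul(25.0, 2.0, 60.0)
print(f"Add {volume:.1f} ul of cytosol")
```

Computing the volume per aliquot, rather than diluting the stock in advance, avoids an additional freeze-thaw cycle.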



2.1.4. ATP regenerating system 

Optimal ERAD efficiency requires an ATP regenerating system. To this
end, a 10× stock is made up as follows:

• 10 mM ATP

• 500 mM creatine phosphate

• 2 mg/ml of creatine phosphokinase

• B88 to volume

The solution is then distributed into single-use aliquots, frozen in liquid
nitrogen, and stored at -80 °C. Reactions lacking the addition of the ATP
regenerating system may support a low level of retrotranslocation/ERAD
and ubiquitination. Thus, to fully decipher the ATP-dependence of the
following reactions, we have performed incubations with either ATPγS
(Lee et al., 2004) or apyrase (Nakatsukasa et al., 2008) in place of the
regenerating system.
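
As a quick check on the 10× stock, the final concentration of each component after dilution into a reaction can be computed directly. The sketch below assumes the 6 μl per 60 μl dilution used in the translocation and degradation reactions of this chapter; the function and dictionary names are illustrative only.

```python
# Composition of the 10x stock as listed in the text.
STOCK_10X = {
    "ATP (mM)": 10.0,
    "creatine phosphate (mM)": 500.0,
    "creatine phosphokinase (mg/ml)": 2.0,
}


def final_concentrations(stock: dict, stock_ul: float, reaction_ul: float) -> dict:
    """Scale every stock concentration by the dilution into the reaction."""
    factor = stock_ul / reaction_ul
    return {name: conc * factor for name, conc in stock.items()}


final = final_concentrations(STOCK_10X, stock_ul=6.0, reaction_ul=60.0)
for name, conc in final.items():
    print(f"{name}: {conc:g}")
```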

2.2. The in vitro degradation assay for pαF

Prior to examining the retrotranslocation and degradation efficiencies of
pαF, the precursor to this substrate and the precursor to the wild-type
control, 3GpαF, must be introduced into ER-derived microsomes through
an in vitro translocation assay. The resulting microsomes are then reisolated
to examine the degradation efficiency and retrotranslocation steps during
the ERAD of a soluble substrate.

2.2.1. Translocation of ppαF and ΔGppαF into yeast microsomes

1. A translocation reaction is set up in a microcentrifuge tube on ice. Most
commonly, the 60 μl reaction contains 45 μl of B88, 6 μl of the 10×
ATP regenerating system, 5 μl of yeast microsomes, and 4 μl of
35S-labeled ΔGppαF or ppαF (or the appropriate volume to obtain
~300,000 cpm per reaction). The reaction is mixed gently and then
incubated for 1 h in a 20 °C water bath.

2. Following the incubation, the solution is centrifuged at ~16,000 × g for
3 min at 4 °C.

3. After the tubes are placed on ice, the supernatant is removed with a 
gel-loading tip. Care must be taken not to disturb the pellet. 

4. The pellet is gently resuspended in 60 μl of ice-cold B88 and the solution
is recentrifuged, as above.

5. The supernatant is again removed and the pellet is taken up in 5 μl of
ice-cold B88.
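
The volumes in step 1 depend on the specific activity of the labeled substrate, since the text asks for roughly 300,000 cpm per reaction. The sketch below shows one way to work out the substrate and B88 volumes for a 60 μl reaction; the specific activity in the example (75,000 cpm/μl) is a hypothetical value chosen so that the result reproduces the 45/6/5/4 μl recipe given above.

```python
def translocation_mix(substrate_cpm_per_ul: float,
                      target_cpm: float = 300_000,
                      total_ul: float = 60.0,
                      regen_ul: float = 6.0,
                      microsomes_ul: float = 5.0) -> dict:
    """Return component volumes (microliters) for one translocation reaction."""
    substrate_ul = target_cpm / substrate_cpm_per_ul
    b88_ul = total_ul - regen_ul - microsomes_ul - substrate_ul
    if b88_ul < 0:
        raise ValueError("substrate is too dilute to fit in the reaction volume")
    return {
        "B88": b88_ul,
        "10x ATP regenerating system": regen_ul,
        "microsomes": microsomes_ul,
        "labeled substrate": substrate_ul,
    }


# Hypothetical specific activity of the translation product.
mix = translocation_mix(substrate_cpm_per_ul=75_000)
for component, volume in mix.items():
    print(f"{component}: {volume:.1f} ul")
```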



2.2.2. Reconstitution of cytosol- and ATP-dependent degradation

The following assay has been adapted to assess the contributions of a number
of ER lumenal and integral membrane components to the ERAD of a
soluble substrate (Brodsky et al., 1999; Gillece et al., 1999; Kabani et al.,
2003; Lee et al., 2004; McCracken and Brodsky, 1996; Nishikawa et al.,
2001; Pilon et al., 1997). The role of the proteasome during this process can
be assessed either through the use of cytosol from a proteasome mutant or
through the addition of proteasome inhibitors (Werner et al., 1996).
Moreover, the cytosol requirement can be circumvented through the addition of
purified 26S proteasomes isolated from yeast or mammals (Lee et al., 2004).
Negative controls for this experiment include reactions supplemented with
ATPγS or apyrase, and/or reactions lacking cytosol (Lee et al., 2004;
McCracken and Brodsky, 1996; Werner et al., 1996).
To a microcentrifuge tube on ice, with the appropriate amount of ice-cold
B88 for a final volume of 60 μl, the following reagents are added (in
order):

• 5 μl of microsomes containing 35S-labeled pαF (a product of the
translocation reaction with ΔGppαF) or 3GpαF (a product of the translocation
reaction with ppαF), prepared as described above

• 6 μl of the 10× ATP regenerating system

• An appropriate amount of yeast cytosol to obtain a final concentration of
1-3 mg/ml

1. The reaction is incubated at 30 °C for 20 min or at higher temperatures
if the contributions of some temperature-sensitive mutations will be
monitored (Brodsky et al., 1999; Gillece et al., 1999; Kabani et al., 2003;
Lee et al., 2004; Nishikawa et al., 2001; Pilon et al., 1997). Multiple
reactions can also be set up if a time-course will be conducted.

2. The reaction tubes are placed on ice and 12 μl of an ice-cold 100%
TCA stock solution is added.

3. The quenched reactions are agitated vigorously on a Vortex mixer for
~3 s and incubated on ice for 15 min.

4. The solutions are centrifuged at 16,000 × g for 5 min at 4 °C, and the
supernatant is removed with a gel-loading tip.

5. Sufficient ice-cold acetone is added to cover each pellet, and the 
mixture is again briefly agitated on a Vortex mixer and immediately 
recentrifuged. 

6. The acetone is removed with a gel-loading tip and the pellet is
air-dried for 2-3 min.

7. The final pellets are resuspended in SDS-PAGE sample buffer by
repetitive pipetting, and the mixture is incubated for 10 min at
~70 °C.

8. The radiolabeled proteins are best resolved through the use of an 18%
denaturing polyacrylamide gel that also contains 6 M urea. To further
maximize the separation between the signal sequence-containing (i.e.,
ΔGppαF, ~18 kDa) and signal sequence-cleaved (i.e., pαF, ~16 kDa)
species, the latter of which is an ERAD substrate (Fig. 27.2), we use 6 cm
gels. Once the dye front is near the bottom of the gel, the plates are
disassembled, the gels are fixed and dried, and the radioactivity is
visualized using a phosphorimager. A 2- to 3-day exposure is usually
sufficient.

9. To quantify the amount of degradation, we consider only the
translocated pαF (i.e., the pαF species in the "-cytosol" control at
t = 0 min; Fig. 27.2). Thus, the amount of pαF remaining under conditions
that promote degradation should be averaged and compared to the average
amount of material in control reactions that lack ATP and/or cytosol.
The wild-type substrate (i.e., 3GpαF, ~28 kDa) should be stable
regardless of the assembled reaction conditions.
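
The comparison described in step 9 can be expressed as a ratio of averaged band intensities. The following is a minimal sketch under the assumption that each reaction yields a single background-corrected phosphorimager value for the pαF band; the numbers in the example are invented purely to show the calculation.

```python
from statistics import mean


def percent_remaining(degradation_signals, control_signals) -> float:
    """Average substrate signal under degradation conditions, expressed as a
    percentage of the average signal in the control reactions."""
    return 100.0 * mean(degradation_signals) / mean(control_signals)


def percent_degraded(degradation_signals, control_signals) -> float:
    """Complement of percent_remaining."""
    return 100.0 - percent_remaining(degradation_signals, control_signals)


# Invented band intensities (arbitrary units) for illustration only.
plus_cytosol_atp = [40, 50, 60]
minus_cytosol = [100, 100, 100]
print(f"{percent_degraded(plus_cytosol_atp, minus_cytosol):.0f}% degraded")
```

Applying the same function to the 3GpαF band provides a check that the wild-type control remains stable.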
Fig. 27.2 pαF is a soluble ERAD substrate. The cytosol- and time-dependent
degradation of pαF is shown. Values represent the time (in min) that
microsomes containing pαF were incubated in the presence or absence of
cytosol and an ATP-regenerating system at 30 °C. ΔGppαF is the
membrane-associated, untranslocated precursor.
Figure taken from McCracken and Brodsky (1996). © McCracken and Brodsky
(1996). Originally published in The Journal of Cell Biology. 132: 291-298.
Fig. 27.3 pαF is retrotranslocated from ER-derived yeast microsomes into the
cytosolic fraction. Either ΔGppαF or ppαF, as indicated, was translocated
into microsomes and after a 25-min incubation in the presence or absence of
cytosol and an ATP-regenerating system the reaction was centrifuged to obtain
a pellet (P) and supernatant (S) fraction. As noted, in one reaction the
supernatant fraction was treated with protease (trypsin at a final
concentration of 0.2 mg/ml for 30 min at 4 °C). ΔGppαF and ppαF are
membrane-associated, untranslocated precursors, as in Fig. 27.2. Figure taken
from McCracken and Brodsky (1996). © McCracken and Brodsky (1996).
Originally published in The Journal of Cell Biology. 132: 291-298.

2.3. The pαF retrotranslocation assay

The degradation assay, described above, can be modified to monitor
the retrotranslocation of pαF from yeast ER microsomes (Fig. 27.3). The
translocation of wild-type ppαF, which forms 3GpαF, serves as a negative
control (i.e., 3GpαF should remain in the microsome fraction). The
inclusion of ATPγS instead of the ATP-regenerating system also serves as a
negative control. In addition, the assay can be used to assess
retrotranslocation efficiency upon the addition of purified cytoplasmic
proteins, such as the proteasome, the 19S proteasome "cap," Cdc48p, and
purified chaperones (Lee et al., 2004). Of note, the retrotranslocated pαF is
protease sensitive, indicating that it is not encapsulated in ER-derived
vesicles (Fig. 27.3) (McCracken and Brodsky, 1996).

1. Translocation reactions (60 μl) are assembled as described above.

2. The microsomes are harvested, washed, and used in the degradation 
assay containing B88, the ATP-regenerating system, and yeast cytosol, 
as presented in the preceding section. 

3. The reaction is incubated for 20 min at 30 °C or higher temperatures in
the event that reagents from temperature-sensitive mutant strains are
being examined. The initial rate of retrotranslocation can also be
examined by taking earlier time points (e.g., 5 and 10 min).

4. The microsomes are pelleted in a refrigerated microcentrifuge at
16,000 × g for 3 min.

5. The tubes are returned to ice and the supernatant is quickly removed
with a gel-loading tip and placed in a prechilled fresh tube. The
supernatant will contain the retrotranslocated pαF. Care must be
taken not to disturb the pellet.

6. Twelve microliters of ice-cold 100% TCA is immediately added to the 
supernatant. 

7. This solution is agitated on a Vortex mixer for ~3 s and then incubated
at 4 °C for 15-20 min.

8. During this time, the pellet is resuspended in 60 μl B88 and 12 μl of
100% ice-cold TCA is added. The solution, which contains
microsome-retained pαF, is also agitated and incubated at 4 °C for 15 min.

9. After 15-20 min, each of the samples (the supernatant/cytosol and
pellet/microsomes) is centrifuged at 16,000 × g in a refrigerated
microcentrifuge, and the supernatants are removed with a gel-loading tip.

10. Sufficient ice-cold acetone is added to cover the pellets, and the 
solution is briefly agitated on a Vortex mixer. 

11. The samples are immediately recentrifuged, as above, and the
supernatant is removed with a gel-loading tip.

12. The pellets are air-dried for 2-3 min and resuspended in SDS-PAGE
sample buffer by repetitive pipetting.

13. The solution is incubated for 10 min at ~70 °C, and the products are
resolved on 6 M urea-18% denaturing polyacrylamide gels, as described
above.

14. After the gels are fixed and dried, and the radioactive species are
visualized using a phosphorimager, the amounts of pαF in the supernatant
and pellet are summed to establish the total pαF in the reaction.
Then, the amount of pαF in the supernatant is divided by the total pαF
to calculate the percentage of pαF retrotranslocated to the supernatant.
These values are averaged among triplicate experiments. Most
commonly, the amount of pαF exported in the negative controls is
<10%. For the data shown in Fig. 27.3, 36% of pαF was retrotranslocated
in the presence of cytosol/ATP.
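
The calculation in step 14 can be written out as a short sketch. Each replicate is assumed to supply one phosphorimager value for the supernatant band and one for the pellet band; the triplicate values below are invented for illustration and are not data from the cited work.

```python
from statistics import mean


def percent_retrotranslocated(supernatant: float, pellet: float) -> float:
    """Supernatant signal as a percentage of total (supernatant + pellet)."""
    total = supernatant + pellet
    if total <= 0:
        raise ValueError("total signal must be positive")
    return 100.0 * supernatant / total


# Invented (supernatant, pellet) band intensities for three experiments.
replicates = [(18.0, 42.0), (21.0, 39.0), (15.0, 45.0)]
average = mean(percent_retrotranslocated(s, p) for s, p in replicates)
print(f"Average retrotranslocation: {average:.0f}%")
```

The same function applied to the negative controls gives the background against which the signal is judged.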




3. In Vitro Assays for Integral Membrane 
Proteins that are ERAD Substrates 

To better define the ERAD pathway taken by integral membrane
proteins in the yeast ER, we developed a system in which the selection,
ubiquitination, retrotranslocation, and degradation of Ste6p*, a mutated
form of the Ste6p a-mating factor transporter, could be followed. Ste6p*
was chosen because the genetic requirements for the degradation of this
integral membrane protein were relatively well defined (Huyer et al., 2004;
Loayza et al., 1998; Vashist and Ng, 2004). In addition, the mutation in
STE6 (Q1249X) results in a C-terminally truncated form of the protein;
therefore, the quality control "decisions" that trigger the destruction of
Ste6p* are made posttranslationally. In principle, this simplifies the
machinery required for the ERAD of the substrate, and excludes the
contribution of cotranslational (i.e., ribosome-associated) factors. Finally,
Ste6p* is a member of the large ABC family of transporters, which includes
the cystic fibrosis transmembrane conductance regulator (CFTR): The
topologies of Ste6p* and CFTR are identical and the proteins share
domain-restricted sequence homology. Previous work had also established many
of the genetic requirements for the ERAD of CFTR after its heterologous
expression in yeast (Ahner et al., 2007; Gnann et al., 2004; Youker et al.,
2004; Zhang et al., 2001). Consequently, microsomes prepared from yeast
strains expressing either Ste6p* or CFTR can be employed for many of the
assays described below (Nakatsukasa et al., 2008).

3.1. In vitro ubiquitination assay 

3.1.1. Isolation of yeast microsomes containing integral membrane 
ERAD substrates 

Yeast expressing HA-tagged versions of Ste6p* (Huyer et al., 2004) or
CFTR (Zhang et al., 2001, 2002) have been described previously, and
ER-derived microsomes are prepared from these strains as presented
above. The only difference is that microsomes are isolated from strains
expressing the ERAD substrate; therefore, yeast are grown in selective
media.

When one wishes to recapitulate a temperature-sensitive mutant defect
in these assays, a more rapid, smaller scale microsome preparation should be
used (Nakatsukasa et al., 2008). The small-scale preparation minimizes the
elapsed time between the in vivo temperature shift and the use of the isolated
reagents in the following assays. An appropriate negative control for each
reaction is the preparation of yeast microsomes from a strain lacking the
plasmid for the expression of the ERAD substrate, but that instead contains
the expression plasmid without an insert.

3.1.2. Yeast cytosol 

The cytosol required for the following experiments is prepared as described in
Section 2.1. When the phenotypes associated with temperature-sensitive
mutants are to be recapitulated, the duration of the in vivo shift to the
nonpermissive temperature must be determined empirically, and can range
substantially (also see above).

3.1.3. Preparation of 125I-labeled ubiquitin

Bovine ubiquitin (Sigma) is dissolved in phosphate-buffered saline at a final
concentration of 10 μg/μl. The protein is then labeled with 125I (NEN
Research, BioRad) using the ICl method (Helmkamp et al., 1960;
McFarlane, 1958). The labeled ubiquitin is enriched and unincorporated
125I is removed with a D-salt Excellulose Desalting column (Pierce). The
final, isolated product is stored in 20 μl aliquots at -80 °C at a final
concentration of 0.2 μg/μl (~1.0x10 cpm/μl). The reagent must be
used within 2 months after its preparation.
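
The text does not give a reason for the 2-month limit. One consideration that is consistent with it is radioactive decay: 125I has a physical half-life of approximately 59.4 days, so about half of the label is gone after 2 months. The sketch below simply evaluates the standard decay relation and is offered as an illustration under that assumption, not as the authors' stated rationale.

```python
I125_HALF_LIFE_DAYS = 59.4  # approximate physical half-life of iodine-125


def fraction_remaining(days: float,
                       half_life_days: float = I125_HALF_LIFE_DAYS) -> float:
    """Fraction of the original radioactivity left after the given time."""
    return 0.5 ** (days / half_life_days)


for age in (0, 14, 30, 60):
    print(f"day {age:>2}: {100 * fraction_remaining(age):.0f}% of initial cpm")
```

When comparing experiments performed with aliquots of different ages, the measured cpm can be corrected by dividing by this fraction.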

3.1.4. The ubiquitination reaction 

The following reagents are combined into the appropriate volume of B88
(total volume, 18 μl) on ice:

• 2 μl of yeast microsomes containing the HA-epitope-tagged integral
membrane ERAD substrate

• 2 μl of the 10× ATP regenerating system (see Section 2.1)

• Sufficient yeast cytosol to achieve a final concentration of 1-4 mg/ml

The reaction can also be supplemented with apyrase, which serves as a
"-ATP" control, or methylated ubiquitin, which inhibits ubiquitin chain
extension (Hershko and Heller, 1985) (Fig. 27.4). As noted above,
microsomes lacking the ERAD substrate serve as another negative control, as
can reactions lacking cytosol.

1. The mixture is preincubated in a 23 °C water bath for 10 min before
2 μl of 125I-labeled ubiquitin are added.

2. The incubation is then continued for up to 1 h at 23 °C.

3. At the desired time point, 80 μl of an SDS stop solution (50 mM Tris-Cl,
pH 7.4, 150 mM NaCl, 5 mM EDTA, 1.25% SDS, 1 mM PMSF,
1 μg/ml leupeptin, 0.5 μg/ml pepstatin A, 10 mM NEM) are added.






[Fig. 27.4 image: 125I-Ub signal after anti-HA immunoprecipitation (IP: HA); lanes 1-11; incubation times in min; molecular mass markers at 250, 150, 100, 75, and 50 kDa.]



Fig. 27.4 Ste6p* is polyubiquitinated in vitro. Microsomes containing an HA-tagged
form of Ste6p* were incubated with 2 mg/ml cytosol, an ATP-regenerating system, and
125I-labeled ubiquitin at 23 °C or on ice for the indicated times (lanes 1-3) or for
60 min (lanes 4-11). The reactions were then quenched and Ste6p* was
immunoprecipitated with an anti-HA antibody. Samples were processed as described in
the text. Where indicated, reactions either lacked cytosol, or were treated with
apyrase (final concentration of 0.02 μg/μl), 0.5 mg/ml methylated ubiquitin, or
100 μM MG132, a proteasome inhibitor. "-anti-HA" denotes a precipitation performed
in the absence of antibody and "-Ste6p*" denotes that microsomes were prepared from
cells lacking the substrate. Figure taken from Nakatsukasa et al. (2008).



4. The solution is briefly agitated on a Vortex mixer and then incubated at 
37 °C for 30 min. 

5. In preparation for an immunoprecipitation, 400 μl of 50 mM Tris-Cl,
pH 7.4, 150 mM NaCl, 5 mM EDTA, 2% Triton X-100 is added and
the solution is mixed gently and placed on ice.

6. A 2 μl aliquot (~10 μg) of anti-HA antibody (Roche) is added and the
immunoprecipitation reaction is incubated overnight at 4 °C with mild
agitation.

7. A 30 μl, 50% (v/v) suspension of Protein A-Sepharose (GE Healthcare),
equilibrated in 50 mM Tris-Cl, pH 7.4, 150 mM NaCl, 5 mM
EDTA, 1 mM azide, is added and the mixture is incubated at 4 °C for
another 2-3 h.

8. The beads are harvested by a 10 s, low-speed centrifugation at 4 °C and
are then washed four times with 800 μl of ice-cold 50 mM Tris-Cl, pH
7.4, 150 mM NaCl, 5 mM EDTA, 1% Triton X-100, 0.2% SDS,
10 mM NEM.

9. After the final wash, the residual buffer is removed with a gel-loading
tip and 30 μl of 2× SDS-PAGE sample buffer are added.

10. The bound proteins are eluted by a 37 °C, 30 min incubation, and the
supernatant is split after a brief centrifugation.

11. One-half of the supernatant (to detect 125I-ubiquitin-modified
protein) is analyzed using a 6 cm, 6% (denaturing) SDS-polyacrylamide gel.
After fixation and drying, the gel is exposed to a phosphorimager plate.

12. The other half of the sample is also resolved by SDS-PAGE, but the gel
is then blotted and probed with anti-HA antibody and the appropriate
secondary antibody to detect the amount of unmodified Ste6p*. We
use the SuperSignal West Pico Chemiluminescent Substrate (Thermo
Scientific) to visualize the signal.



3.2. Analysis of integral membrane protein retrotranslocation 

The Cdc48p complex-dependent retrotranslocation of ubiquitinated
Ste6p* can be followed by inserting a centrifugation step into the protocol
described above. The involvement of the Cdc48p complex in this event was
established through the use of cytosols prepared from a cdc48-3 strain that
had been shifted to 37 °C for 5 h and from an npl4Δ strain (Nakatsukasa
et al., 2008). Therefore, these materials serve as a negative control in the
following experiments. A TAP-tagged version of Cdc48p was also shown to
coprecipitate the retrotranslocated species (Nakatsukasa et al., 2008), further
implicating the Cdc48p complex in Ste6p* retrotranslocation.

1. Ubiquitination reactions are set up as described in Section 3.1, except
that a final volume of 25 μl is achieved.

2. At the completion of the 1 h 23 °C incubation, the microsomes are
pelleted in a refrigerated microcentrifuge at 18,000 × g for 10 min.

3. The reaction tube is returned to ice and the supernatant (~20 μl), which
contains the retrotranslocated, ubiquitinated Ste6p*, is removed with a
gel-loading tip and placed in a new microcentrifuge tube on ice.

4. The pelleted microsomes are resuspended in 25 μl of ice-cold B88, and
20 μl of this suspension is placed in a new tube.

5. To analyze the amount of ubiquitinated Ste6p* in the supernatant
(cytosol) and pellet (microsome) fractions, 80 μl of the SDS stop solution
(see above) is added to the supernatant and resuspended microsomes,
and the mixtures are briefly agitated on a Vortex mixer.

6. As above, the solutions are incubated at 37 °C for 30 min, and then 400 μl
of 50 mM Tris-Cl, pH 7.4, 150 mM NaCl, 5 mM EDTA, 2% Triton X-100
is added and the samples are placed on ice.

7. The Ste6p* in both fractions is immunoprecipitated with anti-HA
antibody/Protein A-Sepharose and analyzed under denaturing conditions
via SDS-PAGE.
Abstract

A range of methods for transforming organisms with nucleic acids has been
established. However, techniques for introducing proteins, or particularly
protein aggregates, into cells are less developed. Here, we introduce a highly
efficient protocol for introducing protein aggregates such as prions into yeast.
The protein transformation protocol allows one to infect yeast with amyloid
fibers of recombinant fragments (Sup-NM) of Sup35p, the protein determinant
of the yeast prion state [PSI+], or in vivo Sup35p prions. Infectivity is dependent
on the concentration of Sup-NM fibers and approaches 100% at
high Sup-NM concentrations. We also describe a method to create distinct
conformations of Sup-NM amyloids. Using the protein transformation protocol,
infection of yeast with different Sup-NM amyloid conformations leads to distinct
[PSI+] strains. This protein transformation procedure is readily adaptable to
other prion proteins and makes it possible to bridge in vitro and in vivo studies
and greatly helps to elucidate the principles of prion inheritance.




1. Introduction 

Like mammalian prions, cytoplasmic genetic elements such as [PSI+]
and [URE3] are observed in yeast. The proteins responsible, Sup35p and
Ure2p, are called "yeast prions" (Wickner, 1994). Although mammalian
and yeast prion proteins are unrelated in amino acid sequence, their prion
states share common structural features such as β-sheet-rich fibrillar
aggregates (amyloids). The facile genetics of yeast together with the ability to
create de novo infectious forms of proteins from pure material have greatly
helped to elucidate the structural and mechanistic basis of prion inheritance
as well as the role of cellular factors in facilitating and inhibiting prion
replication. Here, we describe a procedure for generating distinct,
self-propagating amyloid conformations of a prionogenic Sup35p fragment
termed Sup-NM and a highly efficient protocol for transforming yeast
cells with in vitro amyloid fibers or in vivo prion particles of Sup35p, an
essential translation termination factor (Tanaka et al., 2004). This protocol
can be readily adapted to other proteins and as such represents a general tool
for studying yeast prions (Alberti et al., 2009; Brachmann et al., 2005; Patel
and Liebman, 2007; Patel et al., 2009).

Sup35p forms self-propagating aggregates and leads to a nonsense
suppression phenotype, resulting in a [PSI+] prion state (Chien et al., 2004;
Tessier and Lindquist, 2009; Tuite and Cox, 2003). In yeast containing a
nonsense mutation in the ade1 gene, [PSI+] colonies are white or pink and
grow on media lacking adenine, while nonprion [psi-] colonies are red and
require adenine (Chernoff et al., 1995). [PSI+] propagation is mediated by
an N-terminal Gln/Asn-rich sequence and, to a lesser extent, by a highly
charged middle (M) domain (Bradley and Liebman, 2004; DePace et al.,
1998; Glover et al., 1997; Liu et al., 2002; Ter-Avanesyan et al., 1994).
Transient overexpression of the Sup-NM fragment (residues 1-254) leads to
protein aggregation and de novo appearance of [PSI+]. In vitro, Sup-NM has
been shown to be sufficient to form self-seeding amyloid fibers (Glover et al.,
1997; King et al., 1997). Like mammalian prions and other yeast prions,
[PSI+] exhibits a range of heritable phenotypic strain variants (Derkatch
et al., 1996). These strains differ in mitotic stability (Derkatch et al., 1996),
dependence on the cellular chaperone machinery (Kushnirov et al., 2000b),
and the solubility and activity of Sup35 (Derkatch et al., 1996;
Kochneva-Pervukhova et al., 2001; Uptain et al., 2001; Zhou et al., 1999).
These differences in strain variants lead to distinct ade1 color phenotypes as
well as distinct efficiency and specificity of prion transmission.






To reconcile the existence of strains with the "protein-only" hypothesis
of prion transmission, it has been proposed that a single protein can misfold
into multiple distinct infectious forms, one for each different strain (Aguzzi
and Haass, 2003; Collinge, 2001; Liebman, 2002; Prusiner et al., 1998).
Several studies have found correlations between strain phenotypes and
conformations of prion particles (Bessen et al., 1995; Chien and Weissman,
2001; Chien et al., 2003; Peretz et al., 2002; Telling et al., 1996).
Nonetheless, whether such differences cause or are simply a secondary
manifestation of prion strains had remained unclear, largely due to the
difficulty in creating an infectious protein from an in vitro source and
introducing it into organisms (Aguzzi and Haass, 2003; Liebman, 2002).

An earlier study demonstrated that introduction of bacterially produced
pure Sup-NM by liposome fusion induces the [PSI+] state (Sparrer et al.,
2000). However, Sup-NM amyloid formation occurred after encapsulation
within liposomes, making it difficult to control for or evaluate the
conformation of the infectious form. Therefore, this liposome-based infection
strategy was poorly suited to test the role of prion conformations in strain
diversity. To overcome this limitation, we developed a novel transformation
protocol by which in vitro preformed Sup-NM amyloid fibers or in vivo
Sup35p prions are introduced into yeast spheroplasts with high efficiency.
Here, we describe protocols to prepare different Sup-NM amyloid forms
and in vivo prion particles as well as to efficiently introduce them into yeast.




2. Purification of Bacterially Expressed 
Sup-NM 

A plasmid encoding Sup-NM containing C-terminal polyhistidine tags under
control of a T7 promoter (pAED4-Sup-NM) is transformed into Escherichia
coli BL21 (DE3). The cells are grown at 37 °C until the OD reaches 0.5, and the
protein is expressed with 0.4 mM isopropyl-β-thiogalactoside for 4 h at
37 °C. Sup-NM is purified under denaturing conditions using Ni-agarose and
cation-exchange columns, as follows (DePace et al., 1998; Glover et al.,
1997).



2.1. Purification of Sup-NM 

1. Add 25 ml buffer A (8 M urea, 25 mM Tris, 300 mM NaCl, pH 7.8) to
bacterial cells harvested from 1 l of LB medium. Vortex vigorously to
resuspend pellets. Sonicate the suspension (Sonic Dismembrator, Fisher
Scientific, 40% intensity, tip diameter 3 mm) until the cells are fully
disrupted (approximately 30 s).
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2. Incubate the protein solution with gentle agitation for 1 h at room 
temperature. Spin at 15,000 rpm (Sorvall SS-34 rotor) for 30 min. 
Samples are prefiltered (Millex AP 20, Millipore), followed by filtration 
with 0.45 /im filters. 

3. Pour the protein solution onto a Ni-NTA agarose column (~2 ml 
agarose gels per liter of cell culture) preequilibrated with buffer A. 
Wash the column with 4 column volumes of buffer B (8 M urea,
25 mM Tris, pH 7.8).

4. Elute bound proteins with buffer C (8 M urea, 25 mM Tris, pH 4.5). 
Collect 3-4 ml fractions and analyze them by SDS-PAGE. Pool fractions
with the highest abundance of Sup-NM and store at -80 °C.

5. Load the partially purified Sup-NM from the Ni-NTA column onto a 
6-ml Resource S column (GE) preequilibrated with buffer D (8 M urea, 
50 mM MES, pH 6.0).

6. Elute Sup-NM with a 0-200 mM NaCl gradient in buffer D. Typically,
we monitor absorbance at 229 and 280 nm and collect 1 ml fractions.

7. Analyze the purity of the fractions by SDS-PAGE. Pool fractions that
contain >90% pure Sup-NM. Concentrate the pooled fractions, exchange
buffer D for 6 M guanidine hydrochloride (Gdn) containing 5 mM
potassium phosphate (pH 7.4), and concentrate again, generally to more
than 1 mM, using a VIVASPIN concentrator (Sartorius AG). Filter the
concentrated protein with a Microcon YM-100 (Millipore). Divide
the filtered protein into ~10 μl aliquots and store them at -80 °C
until use.




3. Preparation of Different Conformations of 
In Vitro Sup-NM Amyloid 

Sup-NM has the ability to misfold into multiple different amyloid 
conformations (DePace and Weissman, 2002). We found that the simplest 
method for controlling the conformation of Sup-NM amyloids is to alter 
the temperature at which the polymerization occurs. For preparation of 4 or 
37 °C Sup-NM amyloid, guanidine hydrochloride-denatured Sup-NM was 
diluted by more than 200-fold into 5 mM potassium phosphate buffer (pH 
7.4) containing 150 mM NaCl at 4 or 37 °C, respectively. It is important to
use at least a 200-fold dilution of the Sup-NM stocks and to preincubate the
polymerization buffer at the proper temperature to robustly obtain the 4
and 37 °C fibers. The Sup-NM solution (typically 2.5-5 μM) is immediately
rotated end-over-end at 8 rpm overnight. Conformational
differences between the two Sup-NM amyloid fibers were assessed by the
following melting temperature analysis.
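
The dilution arithmetic behind this step can be checked with a short calculation. The sketch below is only illustrative: the 1 mM stock value is an assumed example (the text says only that stocks are generally more than 1 mM), and the function name is hypothetical. It shows that a 200-fold dilution of a 1 mM stock in 6 M Gdn gives 5 μM Sup-NM with about 30 mM residual denaturant.

```python
def dilute(stock_conc: float, fold: float) -> float:
    """Return the concentration after a `fold`-fold dilution (same units as the stock)."""
    if fold <= 0:
        raise ValueError("fold dilution must be positive")
    return stock_conc / fold


# Assumed example values: 1 mM Sup-NM stock in 6 M Gdn, diluted 200-fold.
sup_nm_stock_uM = 1000.0   # 1 mM expressed in micromolar (assumption)
gdn_stock_mM = 6000.0      # 6 M Gdn expressed in millimolar (from the text)
fold = 200.0

final_sup_nm_uM = dilute(sup_nm_stock_uM, fold)
residual_gdn_mM = dilute(gdn_stock_mM, fold)

print(f"Sup-NM after dilution: {final_sup_nm_uM:.1f} uM")
print(f"Residual Gdn: {residual_gdn_mM:.1f} mM")
```

A more concentrated stock needs a proportionally larger dilution to land in the 2.5-5 μM range, which also lowers the residual denaturant further.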

3.1. Melting temperature analysis of Sup-NM amyloid

1. Generate Sup-NM amyloid fibers at 4 or 37 °C. Add 6x SDS sample
buffer (final 1.7% SDS) to the fiber solution and aliquot 20 μl into 500 μl
PCR tubes.

2. Incubate aliquots of the fiber solution for 5 min in a PCR thermal cycler
at temperatures from 25 to 95 °C in 10 °C intervals, and at 100 °C.
After incubation, transfer each tube into a water bath
(~10 °C) in order to cool it quickly.

3. Run the samples on SDS-PAGE. Probe thermally solubilized Sup-NM
monomer by western blotting with a polyclonal Sup-NM antibody
(Santoso et al., 2000), followed by detection with chemiluminescence.

4. Quantitate band intensities on the western blot with ImageJ (NIH) and fit
them as a function of temperature with IgorPro (WaveMetrics Inc.),
using the following equation: $y = A + B/(1 + 10^{(C - x) \cdot D})$, where x,
y, A, B, C, and D indicate temperature, band intensity, band intensity at
baseline, amplitude of band intensity, melting temperature (Tm), and
width of the melting transition (W), respectively. 4 °C fibers showed
Tm = 56 ± 2 °C and W = 27 ± 2 °C, compared with values for 37 °C
fibers of Tm = 77 ± 2 °C and W = 14 ± 1 °C.
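
Before running the full sigmoidal fit, a rough Tm can be read off the quantified band intensities. The sketch below is not the IgorPro fit described in step 4; it is a simpler stand-in that finds, by linear interpolation, the temperature at which the intensity crosses the midpoint between its lowest and highest values. The function name and the example intensities are hypothetical.

```python
def estimate_tm(temps, intensities):
    """Estimate Tm as the temperature of the half-maximal crossing.

    Uses linear interpolation between the two measured points that bracket
    the midpoint between the lowest and highest band intensity.
    """
    baseline = min(intensities)
    plateau = max(intensities)
    half = baseline + (plateau - baseline) / 2.0
    points = list(zip(temps, intensities))
    for (t0, y0), (t1, y1) in zip(points, points[1:]):
        if y0 < half <= y1:  # first upward crossing of the half-maximal level
            return t0 + (t1 - t0) * (half - y0) / (y1 - y0)
    raise ValueError("intensity never crosses the half-maximal level")


# Hypothetical band intensities at the temperatures used in step 2.
temps = [25, 35, 45, 55, 65, 75, 85, 95, 100]
intensities = [0, 0, 10, 40, 90, 100, 100, 100, 100]
tm_estimate = estimate_tm(temps, intensities)
print(f"Estimated Tm: {tm_estimate:.1f} °C")
```

Such an estimate is useful mainly as a starting value for the melting temperature parameter of the fit, or as a sanity check on the fitted Tm.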







4. Preparation of In Vivo Prions from Yeast 

For preparation of crude yeast extracts, yeast cells were spheroplasted 
with lyticase (see Section 6 for preparation of spheroplasts) or lysed with 
glass beads in the presence of a protease inhibitor cocktail (Roche), and 
sonicated on ice for 10 s (Sonic Dismembrator, Fisher Scientific, 20%
intensity, tip diameter 3 mm) before use. 

1. For preparation of partially purified prion particles, spheroplast yeast cells
with lyticase (~250 μg for yeast cells cultured from 50 ml YPD (1%
yeast extract, 2% bactopeptone, 2% dextrose)) in SCE-buffer (1 M
sorbitol, 10 mM EDTA, 10 mM DTT, 100 mM sodium citrate, pH 5.8)
containing protease inhibitors (protease inhibitor cocktail, Roche) (see
Section 5 for preparation of lyticase).

2. Lyse the spheroplasts by sequential addition of sodium deoxycholate to
0.5% (w/v) and, after 5 min, Brij-58 to 0.5% (w/v) (Uptain et al., 2001).
After incubation on ice for 15 min, spin the lysate at 10,000 × g for 5 min
at 4 °C and ultracentrifuge the supernatant at 100,000 × g for 30 min
(TLA100.3 rotor, Beckman).

3. Resuspend the pellet in 1 M lithium acetate, incubate it on ice for
30 min with gentle agitation, and spin again at 100,000 × g for 30 min.
Resuspend the pellet in 5 mM potassium phosphate buffer containing
150 mM NaCl and sonicate it on ice for 10 s (Sonic Dismembrator,
Fisher Scientific). Determine the concentration of total protein in the yeast
extracts and partially purified prion particles by Bradford or BCA assay,
using BSA as a standard.




5. Preparation of Lyticase 

1. Transform pUV5-lyticase (Scott and Schekman, 1980), a
bacterial plasmid that expresses periplasmically localized lyticase under
control of the T7 promoter, into E. coli BL21(DE3).

2. Inoculate a single colony into 100 ml of LB medium containing 100 μg/ml
ampicillin and culture it at 37 °C until the OD600 reaches ~0.5. Transfer
20 ml of the culture to 1 l of LB medium and culture again at 37 °C.

3. Add isopropyl-β-thiogalactoside (final concentration of 0.5 mM) when
the OD600 reaches ~0.5 to initiate protein expression. After 3 h, collect cells
by centrifugation at 5000 rpm (Sorvall SLA-3000 rotor) for 20 min.

4. Resuspend cells in 20 ml of 25 mM Tris-HCl (pH 7.4), incubate at
room temperature with gentle agitation for 30 min using a nutator, and
centrifuge at 7500 rpm (Sorvall SS-34 rotor) for 10 min.

5. Resuspend the pellet in 20 ml of 5 mM MgCl2, incubate at 4 °C for
30 min, and spin at 15,000 rpm for 30 min (Sorvall SS-34 rotor) to
separate periplasmic components.

6. Dialyze the supernatant twice against 2 l of 50 mM sodium citrate buffer (pH
5.8) for 3 h each, ultracentrifuge the solution at 100,000 × g for 30 min
(TLA100.3 rotor, Beckman), and store the supernatant at -80 °C.
Determine the concentration of lyticase by the Bradford method.
Determine the concentration of lyticase by the Bradford method. 




6. Protein Transformation 

Throughout, we used isogenic [psi-] and [PSI+] derivatives of 74D-694
[MATa, his3, leu2, trp1, ura3; suppressible marker ade1-14(UGA)]
(Santoso et al., 2000). Similar results have been obtained in a W303 background.
In contrast to de novo prion induction by Sup-NM overexpression,
the infection efficiency of protein transformation does not depend on the
[PIN+] state of yeast (Tanaka et al., 2004). The presence of a URA3 plasmid
allows one to preselect for yeast that have successfully taken up material
from the solution but is not absolutely required (Fig. 28.1A).

Figure 28.1 Induction of the [PSI+] prion state by in vitro pure Sup-NM amyloids.
(A) Schematic of transformation procedure. The [PSI+] status is assessed by plating
the spheroplast mixture on SD-Ura containing trace amounts of adenine or on SD-Ura
plates, followed by streaking transformants onto YPD plates. (B) Examples of yeasts
transformed with the indicated materials on SD-Ura plates. Large and white colonies
are yeasts that are converted to [PSI+] prion states. (C) Concentration-dependent and
[PIN+] state-independent infectivity by Sup-NM amyloid. The indicated concentration
of fibrillar (filled circle) or soluble (open circle) Sup-NM was transformed into isogenic
[psi-][PIN+] (black line) or [psi-][pin-] (grey line) strains. Throughout, values with
error bars are expressed as mean ± S.D.

1. Grow yeast cells in 50 ml of YPD medium to an OD600 of 0.5 and
wash successively with 20 ml of sterile H2O, 1 M sorbitol, and SCE-buffer
(1 M sorbitol, 10 mM EDTA, 10 mM DTT, 100 mM sodium
citrate, pH 5.8).

2. Spheroplast cells with lyticase (~250 μg for yeast cells cultured from
50 ml YPD) in SCE-buffer at 30 °C for 30 min. Commercially available
lyticase (Sigma, L-5263) can also be used to prepare spheroplasts (King and
Diaz-Avalos, 2004). Centrifuge the spheroplasts (400 × g, 5 min)
and wash them successively with 20 ml of 1 M sorbitol and STC-buffer (1 M
sorbitol, 10 mM CaCl2, 10 mM Tris, pH 7.5).

3. Resuspend the pelleted spheroplasts in 1 ml of STC-buffer and mix 100 μl of
the spheroplast suspension with sonicated Sup-NM amyloid fibers
(2.5-10 μM) or in vivo prions (200-400 μg/ml), URA3-marked plasmid
(pRS316) (20 μg/ml), and salmon sperm DNA (100 μg/ml). Incubate the
mixture for 30 min at room temperature.

4. Induce fusion by addition of 9 volumes of PEG-buffer (20% (w/v) PEG
8000, 10 mM CaCl2, 10 mM Tris, pH 7.5) at room temperature for
30 min.

5. Collect the cells (400 × g, 5 min), resuspend in 150 μl of SOS-buffer
(1 M sorbitol, 7 mM CaCl2, 0.25% yeast extract, 0.5% bactopeptone),
incubate at 30 °C for 30 min, and plate on synthetic medium lacking uracil
overlaid with top agar (2.5% agar). Adenine (20 mg/l) is absent
(procedure [1]) or present (procedure [2]) in the agar plate.

6. Incubate the plates at 30 °C for ~10 days (procedure [1]) or 4-6 days
(procedure [2]). For procedure [2], after the incubation, streak single
colonies randomly chosen from the SD-URA plates onto modified
YPD plates containing 1/4 of the standard amount of yeast extract
(0.25%) to enhance color phenotypes of [PSI+] and [psi-] states. Include
streaks of strong [PSI+] and [psi-] controls on each plate for comparison.




7. Determination of Prion Conversion 
Efficiency and Prion Strain Phenotypes 

For procedure [1], [psi-] yeast cells form small and intensely red
colonies, whereas [PSI+] yeast form large white colonies (Fig. 28.1B).
Transformation of [psi-] yeast with plasmid alone or with soluble Sup-NM
does not lead to detectable formation of [PSI+] colonies, whereas
inclusion of preformed Sup-NM amyloid seeds leads to significant production
of large, white Ade+ colonies. The Ade+ colonies exhibited the hallmarks
of the [PSI+] prion: the Ade+ trait was inherited in a non-Mendelian
manner, was readily cured by transient growth on medium containing
5 mM guanidine hydrochloride, and was associated with the formation of
large Sup35p aggregates that are readily pelletable by high-speed centrifugation.
We calculated the fraction of Ade+ colonies from more than 200 total
colonies on at least three different SD-URA plates containing trace amounts
of adenine. Increasing the concentration of Sup-NM fibers resulted in a
dose-dependent increase in Ade+ convertants, with the fraction of Ade+
colonies among the Ura+ colonies approaching 100% at high Sup-NM
concentrations (Fig. 28.1C). As expected for a prion, the efficiency of prion
conversion was sensitive to proteinase but not nuclease treatment (Tanaka
et al., 2004).
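
The conversion efficiency for procedure [1] reduces to a per-plate fraction followed by a mean and standard deviation across plates. The sketch below illustrates that bookkeeping; the colony counts are invented for illustration, the function name is hypothetical, and the use of the sample standard deviation is an assumption, since the text states only "mean ± S.D."

```python
from statistics import mean, stdev


def conversion_fraction(ade_plus: int, total: int) -> float:
    """Fraction of Ura+ colonies on one plate that are Ade+ (large, white)."""
    if total <= 0:
        raise ValueError("total colony count must be positive")
    return ade_plus / total


# Hypothetical counts from three SD-Ura plates: (Ade+ colonies, total colonies).
plates = [(50, 100), (60, 100), (70, 100)]

total_colonies = sum(total for _, total in plates)
fractions = [conversion_fraction(a, t) for a, t in plates]
avg = mean(fractions)
sd = stdev(fractions)  # sample S.D. (n - 1); an assumption about the convention

print(f"{total_colonies} colonies scored on {len(plates)} plates")
print(f"Ade+ fraction: {avg:.2f} ± {sd:.2f}")
```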

For procedure [2], the efficiency of conversion to the prion state as well as
the phenotypic strength of prion strains was examined by monitoring color
on modified 1/4 YPD plates. Transformants remained [psi-] (red) or
converted to a weak [PSI+] (pink) or strong [PSI+] (white) state
(Fig. 28.2A and B). After the YPD plates were incubated at 30 °C for a
few days, each streak was classified as a strong [PSI+] (white), weak
[PSI+] (pink), or [psi-] (red) strain. In quantification experiments, typically 56
colonies from at least three independent transformations are streaked on
YPD plates. Ade+ revertants, which are rare, were readily excluded from
the statistics, as they showed brown color on 1/4 YPD plates.
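
The scoring for procedure [2] can be tallied as follows. This is a minimal sketch with hypothetical labels and counts: each streak is recorded as "strong", "weak", "psi-", or "revertant", revertants are dropped from the statistics as described above, and the remaining streaks give the conversion efficiency and the share of strong strains among convertants.

```python
from collections import Counter


def summarize_streaks(calls):
    """Summarize color calls from 1/4 YPD streaks.

    `calls` is a list of labels: "strong" (white), "weak" (pink or sectored),
    "psi-" (red), or "revertant" (brown, excluded from the statistics).
    """
    counts = Counter(calls)
    converted = counts["strong"] + counts["weak"]
    scored = converted + counts["psi-"]
    if scored == 0:
        raise ValueError("no scorable streaks")
    return {
        "scored": scored,
        "conversion": converted / scored,
        "strong_among_converted": counts["strong"] / converted if converted else None,
    }


# Hypothetical calls for 56 streaks.
calls = ["strong"] * 30 + ["weak"] * 10 + ["psi-"] * 14 + ["revertant"] * 2
summary = summarize_streaks(calls)
print(summary)
```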

Fibers formed at 4 °C had a high efficiency of infection, as shown by the
large majority of colonies with a strong (white) [PSI+] strain phenotype.
However, fibers formed at 37 °C had lower infectivity and produced almost
exclusively weak (pink and/or sectored) [PSI+] strains (Fig. 28.3A and B).
The weak strains showed increased levels of soluble Sup35p and were more
readily cured by Hsp104 overexpression (Tanaka et al., 2004). The strain
phenotypes did not depend on the concentration of seed or the infection
efficiency: 4 °C fibers yielded strong strains even when diluted 10-fold
(infection rate ~20%) and 37 °C fibers yielded weak strains even when the
concentration was increased 10-fold (infection rate ~80%) (Fig. 28.3C).
Thus, the [PSI+] strain is determined by the conformation of infectious Sup-NM
amyloid. These results establish that amyloid is an infectious form of
prion protein and that conformational differences in amyloid determine
prion strain variations.

In summary, we developed an efficient protein transformation protocol, 
which allows one to introduce in vitro amyloid or in vivo prions efficiently 
into yeast. Using the technique, we directly demonstrated that conforma- 
tional differences of amyloid fibers constitute the physical foundation of the 
heritable differences in prion strains. Importantly, any protein and protein 
aggregate can be efficiently introduced into yeast by this procedure. Thus,
this novel and versatile protein transformation technique will be a powerful
tool in yeast biology.

Figure 28.2 Generation of multiple [PSI+] strains by in vitro Sup-NM amyloid fibers
and in vivo prions. (A) Infection of yeast without selection for the prion state. Following
transformation with the indicated material, cells were recovered on SD-Ura (top) and
randomly selected colonies were streaked on 1/4 YPD plates to identify [PSI+]
convertants (bottom). In vitro-converted amyloid fibers induced a range of white to pink
(grey color) [PSI+] strains. Throughout, the top (+) and bottom (-) streaks in a 1/4
YPD plate are strong [PSI+] and [psi-] controls, respectively. (B) Induction of prion
state by partially purified prion particles derived from strong (white) or weak (grey)
[PSI+] strains. Note that successful [PSI+] infectants (bottom) show the same strain
phenotypes as the donor [PSI+] strain (top).

Figure 28.3 Generation of distinct [PSI+] strains by different conformations of in vitro
Sup-NM amyloids. (A) Examples of [PSI+] strains resulting from infection with Sup-NM
amyloid spontaneously formed at 4 or 37 °C. White and pink (grey color) and/or
sectored colonies are strong and weak [PSI+] variants, respectively. (B) Quantification
of frequency of [PSI+] strains induced by transformation with 4 or 37 °C Sup-NM
amyloids. White and grey bars show fractions of strong and weak [PSI+] strain
phenotypes. (C) Induction of [PSI+] states using 1/10th the amount of 4 °C Sup-NM
amyloid or 10-fold the amount of 37 °C Sup-NM amyloid used in (A).

Abstract

The budding yeast Saccharomyces cerevisiae is a viable system for the over- 
expression and functional analysis of eukaryotic integral membrane proteins 
(IMPs). In this chapter we describe a general protocol for the initial cloning, 
transformation, overexpression, and subsequent purification of a putative IMP 
and discuss critical optimization steps and approaches. Since expression and
purification are often the two predominant hurdles one will face in studying this
difficult class of biological macromolecules, the intent is to outline the general
workflow while providing insights based upon our collective experience. These
insights should facilitate tailoring of the outlined protocol to individual IMPs
and expression or purification routines.

1. Introduction

Obtaining sufficient quantities of a purified integral membrane protein 
(IMP) for downstream experiments, such as structural or functional analysis, 
can be a daunting task. Common hurdles that one may encounter include 
obtaining sufficient IMP overexpression, extracting the IMP from cellular 
membranes with a detergent and purifying the IMP in functional form. 
Advances in addressing these bottlenecks should facilitate efforts by the 
broader scientific community in pursuing their own particular IMP of 
interest. One such advance is use of the budding yeast Saccharomyces cerevisiae 
to overexpress IMPs (Bill, 2001; Bonander et al., 2005; Griffith et al., 2003; 
Hays et al., 2009; Li et al., 2009; White et al., 2007). When combined with a 
broad range of methods for in vivo functional characterization of IMPs in 
yeast, with its exhaustive genetic toolkit, one can appreciate the inherent 
power of using S. cerevisiae as an expression system. Thus, the objective of
this chapter is to provide a general approach for overexpression of IMPs in 
the yeast S. cerevisiae. In addition, we will provide an introduction to 
purifying the IMP of interest following expression. To accomplish this, 
we will describe our approach to the task while highlighting critical steps 
within the protocol that may require heightened attention. It is important to 
note that overexpression and purification of functional IMPs is still a 
laborious endeavor fraught with problems. As with most difficult journeys, 
many small decisions often come together in dictating the outcome. 




2. General Considerations 

S. cerevisiae is an intensely studied eukaryotic organism. The approach
we have taken with the current chapter is to outline the yeast expression 
protocol currently deployed within our research efforts. At almost every 
step throughout this chapter, an alternative method, vector, column, buffer, 
affinity tag, etc., could be employed with possibly better outcomes for the 
specific protein being studied. Our intent is to convey a generic strategy 
and, where possible, highlight alternatives that we feel the reader should be 
aware of. Since working with IMPs is often an endeavor replete with 
nuances and hurdles, it is our desire that this chapter will provide a founda- 
tion for those not familiar with membrane protein overexpression and 
purification. 

This chapter is organized around the expression and purification of a
putative integral membrane protein termed "POI" for "protein of interest."
The intent is that a reader can substitute the membrane protein he/she
is interested in for this target. As with most procedures, our approach is not
the only viable strategy. It works very well in many cases, though modifications
can be made to suit the system under study. We have made some
key choices based upon our prior experience, including: (1) use of the yeast strain
W303-Δpep4 (leu2-3,112 trp1-1 can1-100 ura3-1 ade2-1 his3-11,15 Δpep4
MATa), (2) a pRS423-GAL1-based inducible plasmid to drive
expression, (3) a C-terminal [linker]-[3C-protease]-[10x His] tag fused to
the expressed protein, and (4) solubilization in the detergent
n-dodecyl-β-D-maltopyranoside (DDM). Each of these is a critical choice and should be
reexamined if the described procedure fails. Finally, previously published
protocols may also be of interest to the reader (Hays et al., 2009; Li
et al., 2009; Newby et al., 2009).




3. Protocol— Molecular Biology 

Our experience is such that multiple expression plasmids, affinity tags, 
and fusion constructs should be tried when pursuing a specific protein. Vectors 
designed to better leverage S. cerevisiae as an expression system are often 
chimeric shuttle vectors with yeast- and bacteria-derived sequences. The
yeast contribution to the vector sequence will determine the location of
transformation: extrachromosomal ectopic expression or chromosomal integration
in mitotically stable yeast strains (Boer et al., 2007). For either plasmid
or genomic expression, the cloned gene needs to be free of introns.
If properly implemented, episomal overexpression in
yeast can be rapidly deployed and often yields milligram quantities of IMPs
(Li et al., 2009; Mumberg et al., 1995). This is accomplished through the
autonomously replicating sequence from the native yeast 2μ plasmid (Christianson
et al., 1992). Thus, for the current example, POI is cloned into a high-copy
2μ episomal expression vector containing a GAL1 promoter (Fig. 29.1). The
GAL1 promoter is useful because it is tightly repressed in the presence of
glucose and strongly induced by galactose, allowing for stringent control of
protein expression. Expression levels can be further manipulated by altering
the copy number through the origin of replication and by swapping out the
GAL1 promoter for constitutive (ADH1, TEF2) or other inducible promoters
(e.g., MET25, PHO5) (Mumberg et al., 1994). In our experience, constitutive
promoters are not effective for IMP overexpression.

To facilitate the process of shuttling a gene between numerous expres- 
sion vectors, and even expression systems, we use ligation independent 
cloning (LIC). We previously described in detail how LIC is
performed within our yeast system (Supplementary Information in Li
et al., 2009). Our experience has led us to prefer a C-terminal rhinovirus
3C cleavable poly-histidine tag as an initial choice when pursuing novel
IMPs. Approximately 30% of IMPs contain an N-terminal signal peptide
involved in proper protein maturation and targeting to cellular membranes.
Since N-terminal tags can interfere with this processing, the preference is to
use C-terminal tags when available. In addition, C-terminal tags provide
greater assurance that the protein being purified through initial steps is the
full-length construct and free of truncation or degradation. If the POI does
not contain a signal peptide, which is often difficult to ascertain for eukaryotic
genes, then N-terminal tags provide greater flexibility in developing
expression constructs. Whatever tag is chosen, care should be taken to
ensure that it is either added to the design of synthetic primers during
cloning or already present within the selected plasmid. Also, a critical step
when including C-terminal tags is to ensure that the native stop codon is
removed from the gene of interest. For the current discussion, we will clone
POI into our p423-GAL1 expression plasmid containing the following
design: Start-[POI]-[linker]-[3C site]-[10x His]-Stop. The choice of
using a rhinovirus 3C protease for tag cleavage is described later.

Figure 29.1 Schematic of the p423-GAL1 expression plasmid. Yeast 2μ-based expression
plasmid containing a GSS-3C protease-10x His tag for the protein of interest using
LIC. Plasmid contains yeast (shown in red), bacterial (shown in green), and phage (shown
in blue) elements conducive to molecular cloning and transformation methodologies.

A general protocol for cloning POI into this plasmid is as follows. Refer 
to Li et al. (2009) for a detailed protocol: 

1. POI is PCR amplified with primers palindromic to our p423-GAL1 LIC
vector.

2. Amplified POI and p423-GAL1 separately undergo T4-polymerase
3'→5' exonuclease digestion in the presence of dATP and dTTP,
respectively.

3. The digested gene and plasmid are then combined at room temperature,
annealed, and transformed into competent Escherichia coli cells.

4. Colony PCR is used to confirm POI insertion into p423-GAL1. Sequencing
the plasmid with GAL1 and CYC1 primers validates POI identity.
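
The exonuclease step in the list above can be pictured with a small simulation. With only one dNTP present, the 3'→5' exonuclease activity of T4 DNA polymerase trims each 3' end until it reaches the first base matching the supplied nucleotide, where resynthesis balances removal; this leaves defined single-stranded 5' overhangs. The sketch below models that rule on a made-up sequence; the function names and the sequence are hypothetical and are not the actual p423-GAL1 LIC sites.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def chew_back(strand: str, dntp_base: str) -> str:
    """Trim a 5'->3' strand from its 3' end until the last base equals dntp_base."""
    end = len(strand)
    while end > 0 and strand[end - 1] != dntp_base:
        end -= 1
    return strand[:end]


def lic_digest(top: str, dntp_base: str):
    """Apply the chew-back rule to both strands of a blunt duplex."""
    bottom = reverse_complement(top)
    return chew_back(top, dntp_base), chew_back(bottom, dntp_base)


# Hypothetical 12-bp duplex treated in the presence of dATP only.
top_trimmed, bottom_trimmed = lic_digest("ATGCCAGTCCTT", "A")
print(top_trimmed)     # top strand after trimming its 3' end
print(bottom_trimmed)  # bottom strand (5'->3') after trimming its 3' end
```

Treating the insert with one nucleotide and the vector with its complement is what makes the two sets of overhangs complementary to each other, provided the primers are designed so that the overhang regions lack the supplied base.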




4. Protocol— Cell Growth 

The plasmid containing POI, destined for transcription, translation,
and proper membrane insertion, must first be introduced into the yeast host
through transformation. Although there are several methods to introduce
genetic material into S. cerevisiae, including Agrobacterium tumefaciens-mediated
transformation (Piers et al., 1996), we use a lithium acetate
transformation protocol with PEG 3350. The episomal vector p423-GAL1
contains the HIS3 gene needed by our strain, W303-Δpep4
(leu2-3,112 trp1-1 can1-100 ura3-1 ade2-1 his3-11,15 Δpep4 MATa), so
transformants must be cultured in synthetic complete medium without histidine
(SC-His) to maintain selection for the plasmid containing POI. Cultures are grown in 375 ml
volumes of SC-His with 2% glucose in 1 l baffled flasks shaking at
220 rpm at 30 °C. Following a growth period of 24 h, the optical density at
600 nm ranges between 15 and 20 for most cultures, with the glucose concentration
generally <0.1%. The culture is induced by adding 125 ml of 4×
YPG (yeast extract, bactopeptone, and galactose) to each flask, bringing the
final volume to 500 ml. DMSO has previously been shown to improve the
expression of certain IMPs and may be tried as a growth additive during
induction (Andre et al., 2006). Growths can easily transition from shaker
flasks to the zymurgy route of large-scale fermentation, as the choice of
inducible promoter enables careful regulation and timing of expression.
Cells are harvested after 16 h at 6000 × g and resuspended in 30 ml of lysis
buffer containing 50 mM Tris (pH 7.4, RT), 500 mM NaCl, and 20%
glycerol (v/v) per half liter of growth. Ideally, we adjust the volume of
growth culture to obtain a minimum of 2-3 mg purified protein per growth
(200-300 μl at 10 mg/ml).
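
The volumes quoted in this section follow from simple dilution arithmetic, which is convenient to script when scaling the culture. The sketch below uses hypothetical function names and only the numbers given in the text: it confirms that 125 ml of a 4× stock brings a 500 ml final volume to 1×, and that 2-3 mg of protein at 10 mg/ml corresponds to 200-300 μl.

```python
def stock_volume_ml(final_volume_ml: float, stock_fold: float) -> float:
    """Volume of an N-fold concentrated stock needed for a 1x final concentration."""
    if stock_fold <= 1:
        raise ValueError("stock must be more concentrated than 1x")
    return final_volume_ml / stock_fold


def sample_volume_ul(mass_mg: float, conc_mg_per_ml: float) -> float:
    """Volume in microliters that holds `mass_mg` of protein at the given concentration."""
    return mass_mg * 1000.0 / conc_mg_per_ml


final_volume = 500.0                                # ml after induction
ypg_volume = stock_volume_ml(final_volume, 4.0)     # 4x YPG to add
culture_volume = final_volume - ypg_volume          # SC-His culture before induction

print(f"Add {ypg_volume:.0f} ml of 4x YPG to {culture_volume:.0f} ml of culture")
print(f"2 mg at 10 mg/ml = {sample_volume_ul(2, 10):.0f} ul")
print(f"3 mg at 10 mg/ml = {sample_volume_ul(3, 10):.0f} ul")
```

Scaling to a different flask size only requires changing `final_volume`; the 3:1 ratio of culture to 4× YPG stays the same.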




5. Protocol— Membrane Preparation and 
Solubilization 

Once harvested, cells expressing POI are lysed mechanically using a
microfluidizer or by bead beating in the presence of protease inhibitors. For
bead beating, we use a 90-ml canister containing approximately 40 ml of
resuspended cell pellet. Each canister is then filled to the top with 0.5 mm
prechilled glass beads, and the cells are lysed using four cycles of 1 min "on" and 1 min
"off." We find that aggressive protease inhibition is not always needed and
is target specific. Crude cell lysate is then centrifuged at 6000 × g for 15 min.
After this centrifugation, lysis efficiency can be judged qualitatively from
the debris pellet, which will contain two layers: a bottom pink (strain
dependent) layer of unlysed cells and a top lighter layer of organelles and
cellular debris from lysed cells. The ratio of the top layer to the bottom
layer is a qualitative indicator of lysis efficiency. We typically have
>90% efficiency at this stage, but >70% is considered acceptable. Collect
the cell lysate (the supernatant of the low-speed spin) while being
careful not to contaminate it with cell debris, and spin it
at 138,000 × g (42,000 rpm using a Ti 45 rotor) for 2 h. Discard the
supernatant from the high-speed spin. Occasionally, a loose upper layer is
obtained following the high-speed spin; it should be retained, as it often
contains a predominant portion of the expressed protein. Resuspend membranes
in approximately 5 ml of membrane resuspension buffer (50 mM Tris
(pH 7.4, RT), 200 mM NaCl, 10% (v/v) glycerol, and 2 mM fresh PMSF)
per liter of culture growth with 10 μl HALT protease inhibitor cocktail (or
your protease inhibitor cocktail of choice). Stir on ice for 30 min and flash
freeze membranes in LN2 or use immediately.
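
When a different rotor is used for the high-speed spin, the speed has to be converted through the rotor radius. The sketch below applies the standard relation RCF = 1.118 × 10^-5 × r (cm) × rpm^2. The 6.98 cm radius is an assumed average radius for the Ti 45 rotor, chosen here because it reproduces the quoted 138,000 × g at 42,000 rpm; check the rotor manual before relying on it. The function names are hypothetical.

```python
RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) at a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_for_rcf(target_rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a target RCF at a given radius."""
    return (target_rcf / (RCF_CONSTANT * radius_cm)) ** 0.5


# Assumed average radius for the Ti 45 rotor (verify against the rotor manual).
TI45_R_AV_CM = 6.98

force = rcf(42_000, TI45_R_AV_CM)
print(f"42,000 rpm at r = {TI45_R_AV_CM} cm: {force:,.0f} x g")
print(f"Speed for 138,000 x g at the same radius: {rpm_for_rcf(138_000, TI45_R_AV_CM):,.0f} rpm")
```

The same two functions can be reused for the 100,000 × g and 200,000 × g spins mentioned elsewhere in these protocols once the radius of the rotor in use is known.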

We commonly use the following detergents for solubilizing membrane
proteins leading to structural work: n-octyl-β-D-glucopyranoside (OG),
n-nonyl-β-D-glucopyranoside (NG), n-decyl-β-D-maltopyranoside (DM),
n-dodecyl-β-D-maltopyranoside (DDM), n-dodecyl-N,N-dimethylamine-N-oxide
(LDAO), and n-dodecylphosphocholine (FC-12). Detergents are
purchased in high-purity form (i.e., "ANAGRADE") from Anatrace.
Numerous other detergents are possible depending on the individual experiment
being performed. Once the POI-3C-10His has been expressed, it is
important to assess how well it can be extracted from the membrane with a
detergent. This is generally accomplished through a broad screen of several
detergents. The recommended concentrations, for the detergents listed above,
when solubilizing cellular membranes are 270 mM OG, 140 mM NG,
10 mM DM, 20 mM DDM, 200 mM LDAO, and 20 mM FC-12 (10×
CMC for other detergents is a recommended starting point). A detailed
protocol for performing this step is available in Box 1 of Newby et al.
(2009). Generally, small aliquots of cellular membranes are mixed with an
equal volume of buffer containing detergent at the above concentration and
then stirred at 4 °C for 12-14 h. Unsolubilized cellular membranes will
pellet at 200,000 × g, so the extent to which a given detergent is able to
solubilize POI-3C-10His can be evaluated by the amount of protein left in
the supernatant following a high-speed spin. When evaluating initial
expression levels via western blots, one may observe several background
bands specific to yeast that may be visible in an epitope-dependent manner.
For anti-His westerns, IST2, a 946 amino acid polypeptide containing a
stretch of seven histidines near the C-terminus, runs at around 100 kDa.
When using anti-FLAG, an unidentified contaminant band often appears
around 60 kDa. An HRP-conjugated Penta-His antibody (Invitrogen)
works best for probing C-terminal poly-histidine tags in our experience.
If available, functional assays to verify activity following detergent solubilization
are highly informative.
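
Detergent stocks are often prepared by weight, so it helps to convert the molar concentrations above into percent (w/v). The sketch below does this for DDM only. The molecular weight of 510.6 g/mol is an assumption taken from general knowledge of DDM rather than from this chapter, so confirm it against the supplier's data sheet; the function name is hypothetical.

```python
# Solubilization concentrations quoted in the text, in mM.
SOLUBILIZATION_MM = {
    "OG": 270.0,
    "NG": 140.0,
    "DM": 10.0,
    "DDM": 20.0,
    "LDAO": 200.0,
    "FC-12": 20.0,
}

# Assumed molecular weight for DDM in g/mol (verify with the supplier).
DDM_MW = 510.6


def millimolar_to_percent_wv(conc_mm: float, mw_g_per_mol: float) -> float:
    """Convert mM to percent (w/v), i.e., grams per 100 ml."""
    grams_per_liter = conc_mm * mw_g_per_mol / 1000.0
    return grams_per_liter / 10.0


ddm_percent = millimolar_to_percent_wv(SOLUBILIZATION_MM["DDM"], DDM_MW)
print(f"20 mM DDM is about {ddm_percent:.2f}% (w/v)")
```

Extending the conversion to the other detergents only requires adding their molecular weights, which are deliberately left out here because they are not given in the text.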




6. Protocol— Protein Purification 

Once the POI-3C-10His protein is extracted from cellular mem- 
branes in soluble form, it may be purified to obtain a sample that is Pure 
(free of other proteins and contaminants), Homogeneous (single uniform
population), Stable (typically over a week in concentrated form at 4 °C), 
and Free of protein-free detergent micelles (this combined state will be 
referred to as "PHSF"). To accomplish this, we employ a narrow range of 
techniques including immobilized metal affinity (IMAC), size-exclusion 
(SEC) and ion-exchange chromatography. These methods are synergistic, 
iterative, and employed to varying degrees depending on the target protein. 
For the current discussion, we will detail a standard approach of IMAC 
followed by cleavage of the expression tag, reverse-IMAC to remove
uncleaved protein, and finally SEC to obtain the purified protein in diluted 
form. This sample will then be concentrated and analyzed prior to use. If 
this sample is intended for structure determination (i.e., crystallization) then 
special caution should be taken to avoid a significant concentration of 
protein-free detergent micelles (Newby et al., 2009).

Thus, we will continue with the theme of purifying our target protein,
POI-3C-10His, which was solubilized in the previous section. Recommended
detergent concentrations for SEC buffers are as follows: 40 mM
OG, 12 mM NG, 4 mM DM, 1 mM DDM, 12 mM LDAO, and 4 mM
FC-12 (2x CMC is a good starting point for most detergents). For the
current example, we will use 1 mM DDM in all buffers (as determined in
Section 5). The initial step in protein purification is a metal-affinity purification
of the solubilized membranes; we generally use 125 μl of Ni-NTA
agarose resin (Qiagen) per mg of expected protein yield. The selected
IMAC resin should be prepared according to the manufacturer's specifications
and optimized as needed. The solubilized membranes should be incubated
with IMAC resin at 4 °C with nutation for at least 1 h, though generally not
longer than 3 h. We have found that the degree of target protein binding to
Ni-NTA resin does not increase substantially past 3 h, though increased
proteolysis and binding of contaminant proteins may occur. Following
incubation, the Ni-NTA resin containing bound POI-3C-10xHis protein
should be transferred to a gravity flow column and washed with 20 column
volumes of Buffer A (20 mM Tris (pH 7.4, RT), 200 mM NaCl, 10% (v/v)
glycerol, 4 mM β-ME, 1 mM PMSF, and 1 mM DDM) containing 10 mM
imidazole. If following the wash by absorbance, it is beneficial to wash until
A280 returns to baseline. It is important at this point to obtain about
10 μl of initial flow-through for SDS-PAGE analysis. The above steps are
repeated with Wash 2 (Buffer A with 25 mM imidazole) and Wash 3 (Buffer
A with 40 mM imidazole) buffers. Finally, POI-3C-10xHis is eluted from
the column using the IMAC elution buffer (Buffer A with 300 mM imidazole).
If possible, reduce the flow rate prior to elution to ensure the target
protein elutes in a minimal volume. Be careful to observe the eluted sample
for turbidity, especially over the ensuing several minutes, as the protein may
be unstable in the prescribed buffer and thus precipitate out of solution at
this point. If precipitation occurs, one can make appropriate changes to the
IMAC buffers (e.g., changing salt concentration or pH) to increase protein
stability. It is also advisable to perform a buffer exchange immediately
following elution into 20 mM HEPES (pH 7.4), 150 mM NaCl, 10%
(v/v) glycerol, 4 mM β-ME, 1 mM PMSF, and 1 mM DDM (SEC buffer).
This can be accomplished with a small desalting column such as the Econo- 
Pac 10 DG disposable chromatography column from Bio-Rad (cat. no. 
732-2010). Following IMAC and buffer exchange, the POI-3C-His pro- 
tein is ready for cleavage of the linker-3C-10xHis expression tag. 
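
As a rough planning aid, the resin and wash volumes implied by the figures above (125 µl of resin per mg of expected yield, 20 column volumes per wash, and the 10/25/40/300 mM imidazole series) can be tabulated in a few lines of Python. This is only a sketch: the function name, the example yield of 4 mg, and the assumption that one column volume equals the settled resin bed volume are illustrative and not part of the protocol.

```python
# Planning sketch for the IMAC step. The constants come from the text;
# the function name and the example yield are hypothetical.
RESIN_UL_PER_MG = 125        # µl Ni-NTA resin per mg of expected protein
WASH_COLUMN_VOLUMES = 20     # column volumes used for each wash
IMIDAZOLE_MM = {"Wash 1": 10, "Wash 2": 25, "Wash 3": 40, "Elution": 300}


def imac_plan(expected_yield_mg):
    """Return (resin bed volume in ml, volume of each wash in ml).

    Assumes one column volume equals the settled resin bed volume.
    """
    resin_ml = expected_yield_mg * RESIN_UL_PER_MG / 1000.0
    wash_ml = resin_ml * WASH_COLUMN_VOLUMES
    return resin_ml, wash_ml


resin_ml, wash_ml = imac_plan(4.0)  # 4 mg is an arbitrary example
print(f"Resin bed volume: {resin_ml:.2f} ml")
for step, imidazole in IMIDAZOLE_MM.items():
    if step == "Elution":
        # The text does not specify an elution volume, so none is printed.
        print(f"{step}: Buffer A + {imidazole} mM imidazole")
    else:
        print(f"{step}: {wash_ml:.1f} ml of Buffer A + {imidazole} mM imidazole")
```

The expected yield is itself an estimate, so the resulting volumes are a starting point to be adjusted after the first purification.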

There is a wide variety of site-specific proteases for cleaving affinity
tags, though care should be taken to ensure they are active in the prescribed
detergent (Mohanty et al., 2003). The human rhinovirus 3C protease and
thrombin are both robust and efficient proteases that have worked very well
for cleaving affinity tags attached to detergent-solubilized membrane
proteins. We have had great success with an MBP-3C fusion construct
described previously (Alexandrov et al., 2001). To cleave the POI-3C-His
affinity tag, the protein should be incubated overnight at 4 °C with
approximately a 1:5 ratio of protease to target protein in whatever volume of buffer
is obtained in the desalting step above. Retain 10-µl samples taken before and
after cleavage for SDS-PAGE analysis to evaluate cleavage. Following cleavage,
a reverse-IMAC purification (i.e., the flow-through is retained) is
performed using metal-affinity resin to separate cleaved 3C-His tag and
protease (which is also His tagged) from the target protein. This step entails a
1 h incubation with IMAC resin, such as "Talon" metal-affinity resin, in
batch at 4 °C. Following incubation, the flow-through should be
retained; this contains the cleaved POI protein that will be purified in
the next step. Elute resin-bound protein from the column using the IMAC
elution buffer and collect a 10-µl sample for analysis on a gel to ascertain whether
nonspecific binding of the target protein is occurring. Following completion
of this step, the Ni-purified, 3C-cleaved POI protein is now ready for
further purification.

Ion-exchange chromatography is a powerful technique that separates 
macromolecules based upon charge state at a given pH. Though not 
discussed within this chapter, we have often used this technique to purify 
difficult targets, concentrate protein, reduce protein-free detergent micelles,
perform detergent exchanges, or obtain a pH stability profile. We generally 
use 1 or 5 ml disposable HiTrap sepharose Q or SP ion-exchange columns 
from GE Healthcare. Though not performed on every membrane protein, 
ion-exchange chromatography has proven to be a valuable technique and 
should be leveraged when needed. 

The collective experience from the numerous IMP purifications that we 
have performed is that SEC is an essential step in the process of obtaining a 
PHSF sample. SEC allows one to rapidly evaluate the quality of the purified 
protein by analyzing the retention time, shape, and number of eluted peaks 
from the sample. Elution in the void volume of a properly sized SEC
column (i.e., one whose exclusion limit is significantly higher than the
expected molecular weight of the target protein-micelle complex) is indicative of
protein that is not stable under the prescribed solution conditions. Often
this means a new solubilization buffer should be used with optimized 
parameters for detergent selection, pH, and salt concentration. If the target 
is present within the included volume then careful analysis of the peaks 
should be performed. Is the POI resident within a single, Gaussian-shaped
peak or in multiple peaks indicative of several oligomeric states? If the latter, it
may shift to the void over time and, either way, is often indicative of 
stability issues within the specified buffer. Ideally, one will see a single 
well-defined peak within the included volume corresponding to (and 
verified by gels/blots) the POI. For a detailed discussion of membrane
protein SEC characteristics, refer to Figure 3 of Newby et al. (2009).
Coupling fluorescence with SEC, termed fluorescence-detection SEC, is
another approach that requires very small amounts of expressed protein and
is therefore conducive to broad screens (Kawate and Gouaux, 2006).
Troubleshooting is often required during the SEC purification step to 
ascertain the correct buffer conditions for stabilizing the protein in solution 
within a monodisperse peak. A standard approach to this process is varying 
pH (e.g., 5.5 in MES, 7.0 in HEPES, and 8.0 in Tris), salt concentration 
(e.g., 25 mM, 250 mM, and 500 mM NaCl), presence or absence of 
osmolytes (e.g., adding varying concentrations of glycerol or sucrose), 
and addition of putative or known ligands. It is important to note that 
when approaching this step one should be systematic and linear to clearly 
differentiate effects on protein stability and homogeneity. 
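
One way to read "systematic and linear" is to change a single parameter at a time relative to a fixed baseline. The sketch below enumerates such a screen from the pH and salt values given above. The baseline condition, the glycerol levels, and all names are hypothetical choices made for illustration; ligand additions are left out.

```python
# One-factor-at-a-time SEC buffer screen (illustrative sketch).
BASELINE = {"pH": 7.0, "NaCl_mM": 250, "glycerol_pct": 10}  # hypothetical baseline
LEVELS = {
    "pH": [5.5, 7.0, 8.0],           # values from the text
    "NaCl_mM": [25, 250, 500],       # values from the text
    "glycerol_pct": [0, 10, 20],     # assumed levels, not from the text
}
BUFFER_FOR_PH = {5.5: "MES", 7.0: "HEPES", 8.0: "Tris"}


def one_factor_at_a_time(baseline, levels):
    """Return the baseline plus every condition differing from it in one factor."""
    conditions = [dict(baseline)]
    for factor, values in levels.items():
        for value in values:
            if value != baseline[factor]:
                condition = dict(baseline)
                condition[factor] = value
                conditions.append(condition)
    return conditions


for condition in one_factor_at_a_time(BASELINE, LEVELS):
    print(BUFFER_FOR_PH[condition["pH"]], condition)
```

Because each run differs from the baseline in only one parameter, a change in peak shape or retention can be attributed to that parameter, at the cost of not detecting interactions between parameters.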

In continuing with our example of expressing and purifying POI, we 
now have a Ni-purified and 3C-cleaved protein sample that has been 
purified away from cleaved affinity tag and protease. Next, we describe a 
general SEC purification step for this protein sample. There are a number of 
chromatography columns available and care should be taken to ensure that 
the column is appropriate for the desired task and will not interact with the 
detergent (e.g., TSK columns may interact with the detergent LDAO) or 
POI. We generally use a Superdex 200 10/300 GL column from GE 
Healthcare (cat. no. 17-5175-01). This column has a separation molecular
weight range of 10,000-600,000 that is well suited for most membrane
proteins. The POI protein is now in the SEC buffer described above. It is 
important that the SEC column be equilibrated for a minimum of 3 h at 
0.5 ml/min or overnight at 0.1 ml/min to ensure complete equilibration 
with the detergent (DDM in our example). Once equilibrated, the POI 
sample can be run in iterative rounds with a peak height of approximately 
one absorbance unit at 280 nm. The amount of loaded sample will vary 
depending on the presence of contaminating or oligomeric peaks. Gener- 
ally, our approach is to use a chromatography station equipped with an 
auto-injector and fraction collector to enable automated runs, often over- 
night. Care should be taken that the column is not overloaded with sample, 
as this may mask secondary peaks and lead to incomplete purification. A
common approach is to inject 0.5 ml of a sample at 2 A280 units/ml per run,
though the optimal injection amount is ultimately sample dependent.
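
The equilibration times and injection volume quoted above translate into buffer volumes and run counts as sketched below. The 24 ml bed volume is an assumed nominal value for a 10/300 format column and should be checked against the column documentation; the 16 h taken for "overnight", the 2.3 ml example sample, and the function names are likewise illustrative.

```python
import math

COLUMN_VOLUME_ML = 24.0  # assumed nominal bed volume of a 10/300 column


def equilibration_column_volumes(flow_ml_per_min, hours,
                                 column_volume_ml=COLUMN_VOLUME_ML):
    """Column volumes of buffer passed through the column during equilibration."""
    return flow_ml_per_min * hours * 60.0 / column_volume_ml


def injections_needed(sample_volume_ml, injection_ml=0.5):
    """Number of iterative runs needed to process the whole sample."""
    return math.ceil(sample_volume_ml / injection_ml)


print(f"3 h at 0.5 ml/min: {equilibration_column_volumes(0.5, 3):.2f} column volumes")
print(f"16 h at 0.1 ml/min: {equilibration_column_volumes(0.1, 16):.2f} column volumes")
print(f"Runs for a 2.3 ml sample: {injections_needed(2.3)}")
print(f"A280 units loaded per 0.5 ml injection at 2 A280/ml: {0.5 * 2.0:.1f}")
```

Under these assumptions both equilibration schedules pass roughly four column volumes of detergent-containing buffer, which is consistent with their being presented as alternatives.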

Some general considerations regarding the purification step should be 
highlighted. In particular, when working with solubilized membrane pro- 
teins, the actual identity of a sample is a membrane protein with a detergent 
micelle surrounding it. This micelle will often contain endogenous lipids 
from the expression host. First, the protein-detergent-lipid complex
(PDLC) will likely have a shorter retention time relative to a soluble protein 
of the same mass. Thus, it can be hard to ascertain the oligomeric state of a 
PDLC based upon SEC retention time alone. This holds true when com- 
paring it to molecular weight standards because these standards are usually 
composed of small molecules and soluble proteins. In addition, IMPs tend 
to migrate slightly faster than expected on SDS-PAGE gels, giving the
impression that your target is of a smaller mass than expected. Another 
common hurdle is that detergent micelles can occlude the protease recog- 
nition site when trying to remove an expression tag resulting in no, or 
attenuated, cleavage. Two common ways to avoid this potential problem 
are to add a short linker, often three additional amino acids, between 
the target protein and protease recognition site, or to move the expression 
tag to the other protein terminus. Moving the tag may lead to additional 
problems since approximately 30% of IMPs contain a signal peptide at the 
N-terminus. N-terminal tags can interfere with the processing of this signal 
peptide by the signal peptidase leading to retention of the membrane 
protein intracellularly and, as a result, decreased expression levels. Finally, 
when concentrating the purified protein, it should be remembered that the 
sample contains protein-free detergent micelles (since the detergent con- 
centration is above the detergent CMC). Since these micelles can impact 
biophysical properties of the sample, such as crystallization, it is important 
that detailed notes be maintained regarding the concentration factor 
(i.e., starting volume relative to final volume postconcentration) for the 
sample. If possible, one should generally work to minimize the concentration
factor and thereby minimize the protein-free detergent micelles.
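
To illustrate why the concentration factor is worth recording, the sketch below computes it and a worst-case estimate of the detergent concentration after concentrating. It rests on a simplifying assumption that is not taken from the text: free micelles are fully retained by the concentrator membrane while monomeric detergent at the CMC passes through. Actual retention depends on the detergent and the filter cut-off. The CMC value, the volumes, and the function names are illustrative and should be replaced with values appropriate to the detergent and buffer in use.

```python
def concentration_factor(start_volume_ml, final_volume_ml):
    """Starting volume divided by the final volume after concentration."""
    return start_volume_ml / final_volume_ml


def detergent_upper_bound_mm(buffer_detergent_mm, cmc_mm, factor):
    """Worst-case total detergent (mM) after concentration.

    Assumes micelles are fully retained and monomers (at the CMC) pass
    through the membrane; detergent bound to protein is ignored.
    """
    micellar_mm = max(buffer_detergent_mm - cmc_mm, 0.0)
    return cmc_mm + micellar_mm * factor


factor = concentration_factor(10.0, 0.5)              # hypothetical volumes
total_mm = detergent_upper_bound_mm(1.0, 0.17, factor)  # 1 mM DDM buffer; CMC is illustrative
print(f"Concentration factor: {factor:.0f}x")
print(f"Upper bound on detergent after concentration: {total_mm:.2f} mM")
```

Even as an upper bound, the estimate shows how quickly micellar detergent can accumulate, which is the motivation for keeping the concentration factor low.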




7. Protocol— Protein Characterization 

Separating the protein from detergent micelles: Because detergent micelles
do not absorb at 280 nm, separating protein-containing micelles from
protein-free ones requires other detection strategies. We have
found that an in-line four-way detection scheme is useful in differentiating
these species and separating them from each other. These detectors consist 
of UV absorbance and refractive index (RI) detectors for measuring con- 
centration, a differential pressure or intrinsic viscosity detector that indicates 
properties of size and shape, and a right angle light scattering detector that 
indicates molecular mass. In concert, these allow one to (1) optimize 
detergent micelle concentration while maintaining PDLC homogeneity 
of concentrated protein and (2) measure the PDLC oligomeric state 
(mass), size (Rh), shape (IV), detergent:protein ratio, and rate of change
of RI (dn/dc). For common detergents, we have measured size-exclusion 
retention volume (SERV), dn/dc, micelle molecular weight, and retention 
behavior on different molecular weight cut-off filters for empty micelles in 
various systems. These micelle parameters are dependent on buffer compo- 
sition, column type, detergent concentration, and the presence of PDLCs. 
The goal is to minimize the detergent micelle concentration during purifi- 
cation and concentration. The micelle SERV relative to the PDLC SERV 
dictates whether the PDLC can be concentrated before SEC (as they often 
have different SERVs), and if SEC can be used to remove excess micelles 
after protein concentration. Detergent dn/dc is used to quantify excess 
[detergent micelle] after protein concentration, and the amount of deter- 
gent bound in the PDLC. To accurately measure the PDLC physical 
parameters, the PDLC peaks must be baseline-resolved, of adequate inten- 
sity, and Gaussian with no comigrating excess micelles or other buffer 
contaminants (i.e., single SEC peaks for all four detectors). A simpler
approach is to include only an in-line RI detector, which responds to the
detergent micelles that are invisible at 280 nm.
These detectors can be added to existing chromatography stations with 
minimal alterations and will identify SERV values for the specific solubili- 
zation detergent and buffer combination being used. Overall, characteriza- 
tion of the PDLC within the prescribed detergent using the above methods 
facilitates the development of a robust purification and protein concentration
scheme conducive to downstream endeavors. One should view PDLCs
and empty detergent micelles as separate entities during purification and
work to identify the latter early in purification so that they can be minimized as needed.
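
The use of dn/dc to quantify detergent rests on the standard relation that the refractive index increment of a dilute solution is the sum of (dn/dc x concentration) over its solutes. The sketch below applies this to a PDLC peak whose protein concentration is known from UV absorbance, which detergent does not contribute to. It is a generic two-detector calculation and not necessarily the authors' procedure. The dn/dc values and the example readings are illustrative placeholders and should be replaced with values measured in the buffer being used.

```python
def detergent_concentration(delta_n, c_protein_g_per_ml,
                            dndc_protein=0.185, dndc_detergent=0.143):
    """Detergent concentration (g/ml) from the RI increment of a PDLC peak.

    delta_n = dndc_protein * c_protein + dndc_detergent * c_detergent
    The default dn/dc values (ml/g) are illustrative placeholders.
    """
    return (delta_n - dndc_protein * c_protein_g_per_ml) / dndc_detergent


def detergent_to_protein_ratio(delta_n, c_protein_g_per_ml, **dndc):
    """Mass of detergent bound per mass of protein (g/g)."""
    c_detergent = detergent_concentration(delta_n, c_protein_g_per_ml, **dndc)
    return c_detergent / c_protein_g_per_ml


# Hypothetical readings: 1 mg/ml protein from UV, RI increment of 3.28e-4.
ratio = detergent_to_protein_ratio(3.28e-4, 1.0e-3)
print(f"Detergent:protein ratio: {ratio:.2f} g/g")

# For a protein-free micelle peak the protein term is zero.
c_free = detergent_concentration(1.43e-4, 0.0)
print(f"Free detergent: {c_free * 1000:.2f} mg/ml")
```

The calculation is only meaningful for baseline-resolved peaks, as noted above, because comigrating empty micelles would be counted as bound detergent.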
8. Conclusion

S. cerevisiae is a viable and powerful system for overexpressing IMPs, as
yeast is a genetically tractable and inexpensive expression system that can be
easily manipulated experimentally and is conducive to high-, medium-, or
low-throughput methodologies. Furthermore, being a eukaryotic organism,
it contains the necessary posttranslational modification and membrane
targeting machinery to facilitate expression of many higher eukaryotic IMPs
(Li et al., 2009). The methods described in this chapter are focused on the
overexpression and purification of a nominated IMP within yeast. 
Subsequent purification of these proteins can be accomplished if one takes 
appropriate caution and is aware of common hurdles. Whenever possible, 
functional assays should be incorporated into the purification protocol to 
ensure the POI being purified is in functional form. As the collective
knowledge and experience in working with IMPs increase, so do the
rewards and novel biological insights. Indeed, the outlook is very positive
(White, 2009). With so little known about the vast majority of IMPs, we are 
undoubtedly entering a period of dramatic growth in our understanding. 
Abstract

Protein aggregates are associated with a variety of debilitating human dis- 
eases, but they can have functional roles as well. Both pathological and non- 
pathological protein aggregates display tremendous diversity, with substantial 
differences in aggregate size, morphology, and structure. Among the different 
aggregation types, amyloids are particularly remarkable, because of their high 
degree of order and their ability to form self-perpetuating conformational 
states. Amyloids form the structural basis for a group of proteins called prions, 
which have the ability to generate new phenotypes by a simple switch in protein 
conformation that does not involve changes in the sequence of the DNA. 
Although protein aggregates are notoriously difficult to study, recent techno- 
logical developments and, in particular, the use of yeast prions as model 
systems, have been very instrumental in understanding fundamental aspects 
of aggregation. Here, we provide a range of biochemical, cell biological and 
yeast genetic methods that are currently used in our laboratory to study protein 
aggregation and the formation of amyloids and prions. 
1. Introduction

More than 40 years ago an unusual nonsense suppressor phenotype 
was reported that was inherited in a non-Mendelian manner (Cox, 1965). 
This phenotype, [PSI+], was later found to be caused by a change in the
conformation of the translation termination factor Sup35p (Patino et al.,
1996; Paushkin et al., 1996). Shortly after the discovery of [PSI+], a
different genetic element, [URE3], was isolated (Lacroute, 1971) and the 
causal agent was afterward identified as a conformationally altered form of 
the nitrogen catabolite repressor Ure2p (Wickner, 1994). Ure2p and 
Sup35p are the founding members of an intriguing class of yeast proteins 
that can act as protein-based epigenetic elements. Also known as prions, 
these proteins can interconvert between at least two structurally and func- 
tionally distinct states, at least one of which adopts a self-propagating 
aggregated state. A switch to the aggregated prion state generates new 
phenotypic traits, which increase the phenotypic heterogeneity of yeast 
populations (Alberti et al., 2009; Shorter and Lindquist, 2005).

Sup35p is a translation termination factor. When Sup35p switches into a 
prion state, a large fraction of the cellular Sup35p is sequestered into 
insoluble aggregates. The resulting reduction in translation termination 
activity causes an increase in ribosomal frame-shifting (Namy et al., 2008;
Park et al., 2009) and stop codon read-through (Liebman and Sherman,
1979; Patino et al., 1996; Paushkin et al., 1996). The sequences downstream
of stop codons are highly variable, and this, in turn, facilitates the sudden
generation of new phenotypes by the uncovering of previously hidden
genetic variation (Eaglestone et al., 1999; True and Lindquist, 2000; True
et al., 2004). The other well-understood prion protein, Ure2p, regulates
nitrogen catabolism through its interaction with the transcriptional activator 
Gln3p. Its prion state, [URE3], causes the constitutive activation of Gln3p 
and consequently gives cells the ability to utilize poor nitrogen sources in 
the presence of a rich nitrogen source (Aigle and Lacroute, 1975; Wickner, 
1994). 

The epigenetic properties of Sup35p and Ure2p reside in structurally 
independent prion-forming domains (PrDs) with a strong compositional 
bias for residues such as glutamine and asparagine (Edskes et al., 1999; Li and
Lindquist, 2000; Santoso et al., 2000; Sondheimer and Lindquist, 2000).
This observation stimulated the first sequence-based query for additional
yeast prions (Sondheimer and Lindquist, 2000) that ultimately led to the
identification of an additional prion protein called Rnq1p. The prion form
of Rnq1p, [RNQ+], was found to underlie the previously characterized
non-Mendelian trait [PIN+], which facilitates the de novo appearance of
[URE3] and [PSI+] (Derkatch et al., 2000, 2001). Rnq1p, however, not
only induces benign conformational transitions of other prion proteins; it
can also induce toxic conformational changes of glutamine-expansion
(polyQ) proteins that cause Huntington's disease and several other
late-onset neurodegenerative disorders. When polyQ fragments are expressed in
yeast, they create a polyQ length-dependent toxicity that is accompanied by
the formation of visible amyloid-containing aggregates in a
[RNQ+]-dependent manner (Duennwald et al., 2006a; Krobitsch and Lindquist,
2000). These features have allowed yeast prions to be widely used as 
model systems for the study of protein aggregation in vivo. They were 
tremendously useful for unraveling several aspects of protein aggregation, 
including the formation of structural variants, known as prion strains, or the 
presence of transmission barriers between related prion proteins (Tessier and 
Lindquist, 2009). 

Protein aggregation is a highly complex process that is determined by 
intrinsic factors, such as the sequence and structure of the aggregation-prone 
protein, as well as extrinsic factors, such as temperature, salt concentration, 
and chaperones (Chiti and Dobson, 2006; Rousseau et al., 2006). A particular
type of aggregation underlies the self-perpetuating properties of prions.
All biochemically well-characterized yeast prions adopt an amyloid
conformation. Amyloid is a highly ordered fibrillar aggregate. The fibril core is a
continuous sheet of β-strands that are arranged perpendicularly to the fibril
axis. The exposed β-strands at the ends of the fibril allow amyloids to
polymerize by the continuous incorporation of polypeptides of the same
primary sequence. This extraordinary self-templating ability allows prions
to generate and multiply a transmissible conformational state (Caughey
et al., 2009; Ross et al., 2005; Shorter and Lindquist, 2005). Prions can
spontaneously switch to this transmissible state from a default conforma- 
tional state that is usually soluble. 

In a recent systematic attempt to discover new prions in yeast we used the 
unusual amino acid biases of known PrDs to predict novel prionogenic 
proteins in the Saccharomyces cerevisiae proteome. We subjected 100 such 
prion candidates to a range of genetic, cell biological, and biochemical assays 
to analyze their prion- and amyloid-forming propensities, and determined that
at least 24 yeast proteins contain a prion-forming domain (Alberti et al., 2009).
Our findings indicate that prions play a much broader role in yeast biology and 
support previous assumptions that prions buffer yeast populations against 
environmental changes. The fact that prions are abundant in yeast suggests 
that prions also exist in other organisms. Moreover, many more examples of 
functional aggregation, which do not involve an epigenetic mechanism for the 
inheritance of new traits, are likely to be discovered. The methods and 
techniques described here have vastly increased our understanding of protein 
aggregation in yeast. They will allow us to identify additional aggregation- 
prone proteins in yeast and other organisms and will advance our understand- 
ing of the pathological and nonpathological functions of aggregates. 
2. Methods



2.1. Detecting protein aggregation in yeast cells 



Protein aggregates are formed when large numbers of polypeptides cooper- 
ate to form nonnative molecular assemblies. These structures are highly 
diverse, with differences in the amount of β-sheet content, their overall
supramolecular organization and their ability to induce the coaggregation of
other proteins (Chiti and Dobson, 2006; Rousseau et al., 2006). Notwithstanding
the multifactorial nature and complexity of protein aggregation,
protein aggregates can simplistically be classified as disordered (amorphous) 
or ordered (amyloid-like). Amorphous aggregates are generally not well 
characterized due to their tremendous structural plasticity. Recent studies, 
however, indicate that the constituent proteins of some amorphous aggre- 
gates have a conformation that is similar to their native structure in solution 
(Qin et al., 2007; Vetri et al., 2007). Ordered aggregates, on the other hand,
contain greater amounts of β-sheet content and form densely packed
amyloid fibers. Amyloid-like aggregates can be distinguished from disor- 
dered aggregates based on their resistance to physical and chemical pertur- 
bations that affect protein structure, such as increased temperature, ionic 
detergents or chaotropes. In the following section, we provide a variety of 
methods for the analysis of diverse protein aggregates in yeast cells. 

2.1.1. Fluorescence microscopy and staining of 
amyloid-like aggregates 

Aggregating proteins coalesce into microscopic assemblies that can be 
visualized by fluorescence microscopy. This very convenient method for 
following aggregation in cells has greatly expanded our understanding of 
protein misfolding diseases and other nonpathological aggregation-based 
phenomena such as prions (Garcia-Mata et al., 1999; Johnston et al., 1998;
Kaganovich et al., 2008). Two different experimental approaches are available
for the direct visualization of protein aggregates in cells: (1) The
aggregate-containing cells can be fixed and treated with an antibody specific 
to the aggregation-prone protein, or (2) the aggregation-prone protein can 
be expressed as a chimera with a fluorescent protein (FP). Each approach
has both advantages and disadvantages. Immunofluorescence microscopy 
requires an extensive characterization of the antibody to determine its 
specificity and to ensure that it is able to recognize the aggregated state of 
the protein. The benefit, however, is that the aggregation-prone protein 
can be expressed unaltered. In fact, tagging of an aggregation-prone protein 
with an FP can severely interfere with its aggregation propensity by chang- 
ing its overall solubility. The FP tag could also sterically interfere with the 
formation of the amyloid structure. Moreover, some FPs are known to
self-interact. Particularly problematic in this regard are older versions of DsRed,
although we have observed that in rare cases GFP fusions can also cause 
spurious aggregation. Generally, GFP-driven aggregation does not react 
with thioflavin T (ThT) (discussed below) and forms a characteristic well- 
defined high molecular weight species when analyzed by semidenaturing 
detergent-agarose gel electrophoresis (SDD-AGE) (described in
Section 2.1.3). Experiments with FP chimeras, therefore, require some 
caution in the interpretation of the results and we recommend performing 
additional assays to independently establish whether a protein is aggregation- 
prone or not. 

Despite these drawbacks, using FP-chimeras to study protein aggrega- 
tion has several advantages, particularly cost-effectiveness, ease of use, and 
rapid generation of results. Diverse yeast expression vectors are now avail- 
able for the tagging of aggregation-prone proteins with FPs, most of which 
are suitable for determining the aggregation propensities of a protein 
(Alberti et al., 2007, 2009; Duennwald et al., 2006a; Krobitsch and
Lindquist, 2000). The choice of the promoter and the copy number of 
the yeast plasmid are also important parameters that need to be considered 
carefully. Expression from high-copy 2-µm plasmids is highly variable, thus
allowing the sampling of a range of different protein concentrations. This 
property is desirable if the goal is to determine whether a protein is generally 
able to nucleate and enter an amyloid-like state in a cellular environment. 
Low-copy CEN-based plasmids and expression cassettes for integration into 
the genome, however, have more uniform expression levels, a property 
which can be useful if more consistent aggregation behavior is desired. We 
usually try to avoid constitutive promoters, as aggregation can be associated 
with toxicity and frequently triggers growth arrest or cell death. Inducible 
promoters like GAL1 are more suitable and transient expression for 6-24 h
is usually sufficient to induce aggregation in a significant fraction of the cell 
population. 

Aggregation-prone proteins form foci that can be visualized by fluores- 
cence microscopy. Yeast cells expressing aggregation-prone proteins form 
two characteristic types of fluorescent foci (Fig. 30.1A): (1) ring-like structures
that localize to the vacuole and/or just below the plasma membrane
and (2) punctate structures that can be distributed all over the cytoplasm but
preferentially reside close to the vacuole (Alberti et al., 2009; Derkatch et al.,
2001; Ganusova et al., 2006; Taneja et al., 2007; Zhou et al., 2001). The
fibrillar appearance of ring-like structures and their reactivity with amyloid- 
specific dyes suggests that they consist of laterally associated amyloid fibers. 
Punctate foci, on the other hand, do not always stain with amyloid-binding 
dyes, suggesting that these structures can also be of the nonamyloid or 
amorphous type (Douglas et al., 2008). The two types of aggregation can
be distinguished based on their fluorescence intensity and their pattern of
aggregation. Rings and large punctate foci with very bright fluorescence are
indicative of highly ordered amyloid fibers, whereas multiple small puncta
with low brightness usually do not react with amyloid-specific dyes and are
therefore of the amorphous type.

Figure 30.1 Diverse biochemical assays used to detect protein aggregation in yeast cell
lysates. (A) Fluorescence microscopy was used to identify cellular aggregation of
proteins that are expressed as fusions to GFP. GFP alone (left) is equally distributed
throughout the cytosol and nucleus. Aggregation-prone proteins show annular or
punctate fluorescent foci, resulting from the tight packing of amyloid fibrils in
the cytosol. (B) Amyloid-containing fractions were isolated from yeast cell lysates by
sedimentation in SDS-containing lysis buffer. The prion protein is detected in the
total lysate, the SDS-soluble supernatant and the SDS-insoluble pellet fraction
by immunoblotting with a specific antibody. (C) Lysates of yeast cells expressing a
GFP-tagged prion protein were analyzed by SDD-AGE and Western blotting.
The prion protein was detected by immunoblotting with an anti-GFP antibody.
(D) [prion-] and [PRION+] cell lysates were subjected to a filter retardation assay.
Aggregates retained on the membrane were detected by immunoblotting.

To conclusively determine whether these foci result from amorphous or 
amyloid-like aggregation, we use a staining protocol with the
amyloid-specific dye ThT. ThT has an emission spectrum that is red-shifted upon
amyloid binding, thereby allowing colocalization with aggregation-prone
proteins that are tagged with yellow FP. However, staining yeast
cells with ThT can lead to high background levels and thus we recommend 
performing ThT costaining experiments only for proteins with relatively 
high expression levels. To grow the yeast cells for staining with ThT,
a culture is inoculated in the appropriate selective medium for overnight
growth, followed by dilution and regrowth until it reaches an OD600 of
0.25-1.0. Then 8 ml of the culture are transferred onto a 150-ml Nalgene
bottle-top filter (45 mm diameter, 0.2 µm pores, SFCA membrane) and
the solution is filtered by applying a vacuum. When the solution has passed
through the filter, the vacuum is halted and 5 ml of freshly prepared
fixing solution (50 mM H2KPO4, pH 6.5; 1 mM MgCl2; 4% formaldehyde)
is added to the cells on the filter. The cells are resuspended by swirling 
and the suspension is then transferred to a 15-ml tube. The cells 
are incubated at room temperature for 2 h and vortexed briefly every 
30 min. 

The fixed cells are collected by a brief centrifugation step (2 min at 
2000 rcf) and the supernatant is removed carefully and completely. The cells 
are then resuspended in 5 ml of freshly prepared buffer PM (0.1 M H2KPO4,
pH 7.5; 1 mM MgCl2) and collected again by centrifugation. After the
supernatant is removed completely, the cells are resuspended in buffer
PMST (0.1 M H2KPO4, pH 7.5; 1 mM MgCl2; 1 M sorbitol; 0.1%
Tween 20; containing 1x EDTA protease inhibitor mix from Roche).
The volume of the PMST buffer should be adjusted to generate a final cell
density of 10 OD600. One hundred microliters of the cell suspension is then
transferred to a 0.5-ml Eppendorf tube. Then 0.6 µl of β-mercaptoethanol and
20 µl of 20,000 U/ml yeast lytic enzyme (ICN, or use zymolyase at 1 mg/ml)
are added and the cells are incubated on a rotating wheel
at room temperature for 15 min for spheroplasting. The spheroplasted cells
are then resuspended gently in 100 µl PMST, collected by centrifugation
and the resuspension step is repeated once. Subsequently, the cells are
incubated in PBS (pH 7.4) containing 0.001% ThT for 20 min, washed
three times with PBS and then used immediately for fluorescence
microscopy. 
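
The PMST volume needed to reach a density of 10 OD600 follows from the
total number of OD600 units that were harvested. The following is a minimal
sketch of that arithmetic. It assumes that all cells from the filtered
culture are recovered through fixation and washing, which will not be
strictly true in practice; the function name and its parameters are
illustrative and are not part of the protocol.

```python
def pmst_volume_ml(culture_od600, culture_volume_ml=8.0, target_od600=10.0):
    """Return the PMST volume (ml) that gives the target cell density.

    Assumption: complete recovery of cells after filtration, fixation
    and washing. Real yields are lower, so treat this as an upper bound.
    """
    if culture_od600 <= 0 or culture_volume_ml <= 0 or target_od600 <= 0:
        raise ValueError("OD600 values and volumes must be positive")
    od_units = culture_od600 * culture_volume_ml  # total OD600 units harvested
    return od_units / target_od600


# Example: 8 ml of culture at OD600 0.5 holds 4 OD600 units
print(pmst_volume_ml(0.5))
```

For a culture at the upper end of the recommended range (OD600 of 1.0), the
same calculation gives 0.8 ml of PMST.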

The pattern of aggregation is protein-specific, dependent on the level 
and duration of expression and regulated by the physiological state of the 
cell. Many yeast prion proteins, for instance, proceed through a maturation 
pathway that includes an early stage with ring-like aggregation patterns and 
a later stage with punctate cytoplasmic distribution (Alberti et al., 2009).
Other amyloidogenic proteins such as glutamine-expanded versions of 
huntingtin exon 1 almost exclusively form punctate foci, when expressed 
at comparable levels. Interestingly, toxic and nontoxic aggregates of
glutamine-expanded huntingtin have distinct subcellular aggregation patterns.
Toxic huntingtin forms multiple punctate foci, whereas the nontoxic
structural variant is present in a single cytosolic focus (Duennwald et al.,
2006a,b).

Yeast cells have two different aggresome-like compartments to deal with
aggregation-prone proteins such as huntingtin (Kaganovich et al., 2008).
These compartments, called JUNQ and IPOD, contain predominantly
soluble or insoluble misfolded proteins, respectively. The JUNQ is believed
to provide a subcellular location for the proteasome-dependent degradation
of misfolded proteins, whereas the IPOD is enriched for chaperones like
Hsp104 and seems to be a place for the sequestration of insoluble protein
aggregates. Targeting to either of the compartments most likely influences 
the localization of aggregation-prone proteins, but the mechanisms of 
targeting and the factors involved in the maintenance of the compartments 
remain to be determined. 

2.1.2. Sedimentation assay 

In addition to their reactivity with amyloid-specific dyes like ThT, other 
criteria can be used experimentally to determine whether intracellular 
aggregates are amyloid-like, such as their unusual resistance to chemical 
solubilization. The detergent insolubility of amyloids has been used to 
isolate amyloid-containing fractions from yeast cell lysate by centrifugation 
(Bradley et al., 2002; Sondheimer and Lindquist, 2000). In a typical amyloid
sedimentation experiment 10 ml of yeast cells are grown to mid-logarithmic
phase. The cells are collected by centrifugation and washed in water. The
cell pellet is then resuspended in 300 µl of lysis buffer (50 mM Tris, pH 7.5,
150 mM NaCl, 2 mM EDTA, 5% glycerol). To inhibit proteolysis we
supplement the lysis buffer with 1 mM phenylmethylsulphonyl fluoride
(PMSF), 50 mM N-ethylmaleimide (NEM) and 1x EDTA-free protease
inhibitor mix (Roche). The suspension is transferred to a 1.5-ml Eppendorf
tube containing 300 µl of 0.5 mm glass beads. The cells are then lysed using
a bead beater at 4 °C and are immediately placed on ice. Three hundred
microliters of cold RIPA buffer (50 mM Tris, pH 7.0, 150 mM NaCl, 1%
Triton X-100, 0.5% deoxycholate, 0.1% SDS) is added and the lysate is
vortexed for 10 s. Subsequently, the crude lysate is centrifuged for 2 min at
800 rcf (4 °C) to pellet the cell debris. The sedimentation assay is performed
by centrifuging 200 µl of the supernatant in a TLA 100.2 rotor for 30 min
at 80,000 rpm and 4 °C using an Optima TL Beckman ultracentrifuge.
Equal volumes of unfractionated (total) and supernatant samples are
incubated in sample buffer containing 2% SDS and 2% β-mercaptoethanol
at 95 °C for 5 min. The pellet is resuspended in 200 µl of a 1:1 mixture of
lysis buffer and RIPA buffer containing protease inhibitors and boiled in
sample buffer under the same conditions described above. The samples are
then analyzed by SDS-PAGE and immunoblotting with an antibody
specific to the aggregation-prone protein. If a putative prion is analyzed, 
it should predominantly be detectable in the supernatant of prion- 
free cells and in the pellet fraction of prion-containing cells (e.g., see 
Fig. 30.1B).

2.1.3. Semidenaturing detergent-agarose gel electrophoresis

The recent invention of SDD-AGE very conveniently allows the resolution
of amyloid polymers based on size and insolubility in detergent (e.g., see
Fig. 30.1C) (Bagriantsev et al., 2006). We adapted SDD-AGE for large-scale
applications, allowing simultaneous detection of SDS-insoluble conformers
of tagged proteins in a large number of samples (Halfmann and
Lindquist, 2008). This advanced version of SDD-AGE enables one to
perform high-throughput screens for novel prions and other amyloidogenic
proteins.

As a first step, it is necessary to cast the detergent-containing agarose gel. 
Standard equipment for horizontal DNA electrophoresis can be used and 
the size of the gel casting tray and the comb should be adjusted according to 
the number and volume of the samples. We usually prepare a 1.5% agarose
solution (medium gel-strength, low EEO) in 1x TAE. The agarose solution
is heated in a microwave until the agarose is completely dissolved.
Subsequently, SDS is added to 0.1% from a 10% stock. The agarose solution is
then poured into the casting tray. After the gel has set, the comb is removed
and the gel is placed into the gel tank. The gel is then completely submerged
in 1x TAE containing 0.1% SDS.

The following lysis procedure is optimized for large numbers of small 
cultures processed in parallel, although it can be easily modified for
individual cultures of larger volume. For high-throughput analysis of yeast
lysates we use 2 ml cultures grown overnight with rapid agitation in 96-well
blocks. The cells are harvested by centrifugation at 2000 rcf for 5 min and
then resuspended in water. After an additional centrifugation step and
removal of the supernatant, the cells are resuspended in 250 µl spheroplasting
solution (1.2 M D-sorbitol, 0.5 mM MgCl2, 20 mM Tris, pH 7.5,
50 mM β-mercaptoethanol, 0.5 mg/ml zymolyase 100T) and incubated
for 1 h at 30 °C. The spheroplasted cells are collected by centrifugation at
800 rcf for 5 min and the supernatant is removed completely. The pelleted
spheroplasts are then resuspended in 60 µl lysis buffer (20 mM Tris, pH 7.5,
10 mM β-mercaptoethanol, 0.025 U/µl benzonase, 0.5% Triton X-100,
2x HALT protease inhibitor from Sigma-Aldrich). The 96-well block is
covered with tape and vortexed at high speed for 1 min and then incubated
for an additional 10 min at room temperature. The cellular debris
is sedimented by centrifugation at 4000 rcf for 2 min and the supernatant
is carefully transferred to a 96-well plate. As a next step, 4x sample buffer
(2x TAE; 20% glycerol; 8% SDS; bromophenol blue to preference) is
added to a final concentration of 1x, followed by brief vortexing to mix.
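
The volume of 4x sample buffer that brings a lysate to 1x depends on how
much supernatant was actually transferred. The sketch below solves the
dilution equation for that volume. The function name is illustrative, and
the 60 µl in the example assumes that the entire lysate was recovered, which
is an upper bound.

```python
def concentrate_volume_ul(sample_volume_ul, stock_x=4.0, final_x=1.0):
    """Volume of concentrated buffer to add so the mixture ends up at final_x.

    Solves stock_x * v = final_x * (sample_volume_ul + v) for v.
    """
    if sample_volume_ul < 0:
        raise ValueError("sample volume must not be negative")
    if stock_x <= final_x:
        raise ValueError("stock must be more concentrated than the final mix")
    return sample_volume_ul * final_x / (stock_x - final_x)


# Example: 60 µl of cleared lysate plus 20 µl of 4x buffer gives 80 µl at 1x
print(concentrate_volume_ul(60))
```

In other words, one volume of 4x buffer is added per three volumes of lysate.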

In SDS-containing buffers amyloid-like aggregates are stable at room
temperature, but can be disrupted by boiling. Therefore, samples are
incubated for an additional 10 min at room temperature, or, as a negative
control, incubated at 95 °C. Most amyloids will be restored to monomers
by the 95 °C treatment. The samples are then loaded onto the agarose gel.
We usually also load one lane with prestained SDS-PAGE marker, enabling
us to verify proper transfer and to estimate the size of unpolymerized
SDS-soluble conformers. In addition, it is important to use protein
aggregation standards; [psi-] and [PSI+] cell lysates or lysates of yeast cells
overexpressing the huntingtin length variants Q25 (nonamyloid) and Q103
(amyloid) can be used for this purpose. The electrophoresis is performed at
low voltage (< 3 V/cm gel length) until the dye front reaches ~1 cm from the
end of the gel. It is important that the gel remains cool during the run since 
elevated temperatures can reduce the resolution. 
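
The voltage limit scales with the size of the gel, so it is worth computing
for each casting tray. The short sketch below does this, taking the
"< 3 V/cm gel length" guideline literally; the function name is
illustrative, and the gel lengths are example values rather than
recommendations.

```python
def max_run_voltage(gel_length_cm, limit_v_per_cm=3.0):
    """Upper bound for the power supply setting, in volts.

    Assumption: the limit is applied per cm of gel length, as stated in
    the text, not per cm of distance between the electrodes.
    """
    if gel_length_cm <= 0:
        raise ValueError("gel length must be positive")
    return gel_length_cm * limit_v_per_cm


# Example gel lengths in cm; the run should stay below the printed value
for length in (10, 15, 25):
    print(length, "cm gel: stay below", max_run_voltage(length), "V")
```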

For the blotting procedure, we prefer a simple capillary transfer using a 
dry stack of paper towels for absorption. One piece of nitrocellulose and 
eight pieces of GB002 blotting paper (or an equivalent substitute) are cut to 
the same dimensions as the gel. An additional piece of GB002, which serves 
as the wick, is cut to be about 20 cm wider than the gel. The nitrocellulose, 
wick, and four pieces of GB002 are immersed in 1x TBS (0.1 M Tris-HCl,
pH 7.5, 0.15 M sodium chloride). A stack of dry folded paper towels is
assembled that is about 2 cm thick and the same length and width as the gel.
On top of the stack of paper towels four pieces of dry GB002 are placed, 
then one piece of wet GB002, and finally the wet nitrocellulose. The gel in 
the casting tray is briefly rinsed in water to remove excess running buffer. It 
is then carefully moved from the tray onto the stack. We recommend 
adding extra buffer on the nitrocellulose to prevent bubbles from becoming 
trapped under the gel. The remaining three prewetted GB002 pieces are 
placed on top of the gel. To ensure thorough contact between all layers, a 
pipette should be rolled firmly across the top of the stack. The transfer stack 
is subsequently flanked with two elevated trays containing 1x TBS and the
prewetted wick is draped across the stack such that either end of the wick is
submerged in 1x TBS. Finally, the assembled transfer stack is covered with
an additional plastic tray bearing extra weight (e.g., a small bottle of water)
to ensure proper contact between all layers of the stack. The transfer should
proceed for a minimum of 3 h, although we generally transfer overnight.
After the transfer the membrane can be processed by standard immunode- 
tection procedures. 

2.1.4. Filter retardation assay using yeast protein lysates 

Another convenient method for analyzing aggregation is the size-dependent
retention of aggregates on nonbinding membranes (Fig. 30.1D). This
assay was initially developed to investigate amyloid formation of huntingtin
in in vitro aggregation assays (Scherzinger et al., 1997), but it can also be used
to detect aggregates in yeast cell lysates. Cells should be processed as for
SDD-AGE (Section 2.1.3), except that the sample buffer is omitted. Instead,
the lysates are treated with the desired detergent- or chaotrope-containing
buffer. Generally, we use SDS at 0.1-2%. Samples are incubated at room
temperature for 10 min, or, for a negative control, boiled in 2% SDS.

During this incubation period the vacuum manifold is prepared. First, a 
thin filter paper (GB002) that is soaked in water is placed on the manifold. 
Then, a cellulose acetate membrane (pore size 0.2 µm) is soaked in PBS
containing 0.1% SDS and placed on top of the filter paper on the manifold. 
The manifold is closed and the samples are loaded into the wells of the 
manifold. The samples are filtered through the membrane by applying a 
vacuum and the membrane is washed five times with PBS containing 0.1% 
SDS. The cellulose acetate membrane can then be used for immunodetec- 
tion with a protein-specific antibody. To demonstrate that equal amounts of 
protein were present in the samples, the same procedure can be repeated 
with a protein-binding nitrocellulose membrane. As with SDD-AGE, it is
important for filter retardation experiments to include protein aggregation
standards, such as [psi-] and [PSI+] cell lysates or lysates of yeast cells
overexpressing the huntingtin length variants Q25 and Q103. 

2.2. Assays for prion behavior 

Prions are amyloids that are transmissible. Fragments of amyloid fibrils can 
be passed between cells or organisms, and the self-templating ability of 
amyloid results in the amplification of the structure, giving prions an 
infectious property. The prion properties of yeast prions reside in structur- 
ally independent PrDs. The PrDs of Sup35p and other prions are modular 
and can be fused to nonprion proteins, thereby creating new protein-based 
elements of inheritance (Li and Lindquist, 2000). We employ two assays 
which exploit this property of prions to experimentally determine whether 
a predicted PrD can confer a heritable switch in the function of a protein. 
The assays are based on the well-characterized prion phenotypes of the 
translation termination factor Sup35p and the nitrogen catabolite regulator 
Ure2p. 

2.2.1. Sup35p-based prion assay 

Sup35p consists of an N-terminal PrD (N), a highly charged middle domain 
(M) and a C-terminal domain (C), which functions in translation termina- 
tion. Both the N and M domains are dispensable for the essential function of 
Sup35p in translation termination. The charged M domain serves to 
increase the solubility of the amyloid-forming N domain, thereby promot- 
ing the conformational bistability of the Sup35p protein. In the prion state, a 
large fraction of the cellular Sup35p is sequestered into insoluble aggregates, 
resulting in reduced translation termination activity and an increase in the 
read-through of stop codons. Premature stop codons in genes of the adenine 
synthesis pathway, which are present in lab strains such as 74D-694, provide 
a convenient way to monitor switching of Sup35p to the prion state
(Fig. 30.2A). In prion-free [psi-] cells, translation termination fidelity is
high, leading to the production of truncated and nonfunctional Ade1p. As a
consequence the [psi-] cells are unable to grow on adenine-free medium
and accumulate a by-product of the adenine synthesis pathway that confers a
red colony color when grown on other media such as YPD. Prion-containing
[PSI+] cells, on the other hand, have a reduced translation termination
activity that allows read-through of the ade1 nonsense allele and the
production of functional full-length Ade1p protein, resulting in growth on
adenine-deficient medium and the expression of a white colony color.

The modular nature of prion domains enables the generation of chimeras
between the C or MC domain of Sup35p and candidate PrDs that can
then be tested for their ability to generate [PSI+]-like states (Fig. 30.2A).
For several years plasmids have been available that allow the cloning of a 
candidate PrD N terminal to the C or MC domain of Sup35p (Osherovich 
and Weissman, 2001; Sondheimer and Lindquist, 2000). These plasmids can 
be integrated into the yeast genome to replace the endogenous SUP35 
gene. This procedure, however, is very laborious and has a low success rate, 
as the resulting strains frequently express the chimera at levels much lower 
than wild-type Sup35p. As a consequence, these strains aberrantly display a 
white colony color and show constitutive growth on medium lacking 
adenine, preventing their use in prion selection assays. 

Figure 30.2 Using Sup35p to detect phenotype switching behavior. (A) Schematic
representation of the Sup35p-based prion assay. The PrD of Sup35p is replaced with a
candidate PrD and the resulting strains are tested for the presence of prion phenotypes,
such as the switching between a red and a white colony color. (B) Relationship between
the cellular concentration of Sup35p, the colony color in the [psi-] state and the ability
to grow on synthetic medium lacking adenine.

To overcome these difficulties, we recently developed a yeast strain
(YRS100) in which a deletion of the chromosomal SUP35 is covered by
a Sup35p-expressing plasmid (Alberti et al., 2009). When these cells are
transformed with expression plasmids for PrD-SUP35C chimeras, a URA3
marker on the covering SUP35 plasmid allows it to be selected against in
5-FOA-containing medium (plasmid shuffle). The resulting strains contain
PrD-Sup35C fusions as their only source of functional Sup35p. This plas- 
mid-based expression system is more versatile than the previous versions, as 
it easily allows the use of different promoters and therefore a better control 
over the expression level of PrD-Sup35 proteins. We have generated 
vectors with four different promoters, which have the following relative 
strength of expression: SUP35 < TEF2 < ADH1 < GPD. To minimize 
the time needed for strain generation, these vectors enable recombination- 
based cloning using the Gateway system. In recent years, the Gateway 
system has emerged as a very powerful cloning method that allows for the 
rapid in vitro recombination of candidate genes into diverse sets of
expression vectors. We have generated hundreds of Gateway-compatible yeast
expression vectors, each with a different promoter, selectable marker or tag
(Alberti et al., 2007, 2009). This technological improvement now allows us
to perform high-throughput testing of candidate PrD libraries for prion
properties. All Gateway-compatible plasmids described here are available 
through the nonprofit plasmid repository Addgene (www.addgene.org). 

The SUP35C and related fusion assays were instrumental in identifying
new prions and prion candidates (Alberti et al., 2009; Nemecek et al., 2009;
Osherovich and Weissman, 2001; Sondheimer and Lindquist, 2000). However,
it can be difficult to work with PrD-Sup35 chimeras and it is therefore
important to know about the shortcomings of this assay. Whether a
Sup35p-based prion selection experiment will be feasible or not critically
depends on the expression level of the PrD-Sup35C fusion protein and its
functional activity in translation termination. Too low or too high levels of
active Sup35p result in permanently elevated levels of stop codon
read-through, with corresponding [prion-] strains that are able to grow on
adenine-deficient medium and a colony color that is shifted to white
(Fig. 30.2B). It is, therefore, important to find the appropriate window of
expression to generate strains with adequate translation termination fidelity. 
In our lab, we obtained the best expression results with the ADH1 
promoter. 

In general, a sufficient level of translation termination activity is
indicated by a light to dark red colony color. Strains with pink or white colony
colors usually have too little translation termination activity and can
thus not be used in prion selection assays that test for the ability to grow on
adenine-deficient synthetic media (Fig. 30.2B). In rarer cases it is possible
that a high level of read-through is caused by constitutive aggregation of the
PrD and not by insufficient expression levels. To rule out that the
PrD-Sup35C chimera is already present in an aggregated state, SDD-AGE
followed by immunoblotting with an anti-Sup35p antibody should be
performed. It is also important to point out that the size of the PrD that is
fused to Sup35p is a key factor that determines functionality of the chimera.
Based on our experience, PrDs between 60 and 250 amino acids are well 
tolerated. PrDs above this threshold, however, tend to inhibit the transla- 
tion termination activity of Sup35p. In some cases, it can be important to 
include a solubilizing domain such as the M domain between the PrD and 
the C domain, as the presence of M could slow down the aggregation 
kinetics, a property that is particularly desirable if a protein is very aggrega- 
tion-prone. 

To generate a PrD-Sup35C-expressing strain, we introduce the 
corresponding expression plasmid into the YRS100 strain and select the 
transformants on appropriate selective plates. The transformants are isolated, 
grown in liquid medium for a few hours and then plated on 5-FOA plates to 
counterselect the covering SUP35 plasmid. Colonies growing on 5-FOA 
plates are streaked on YPglycerol to select for cells with functional mito- 
chondria and eliminate petite mutants that change the colony color to 
white. Subsequently, the cells are transferred onto YPD plates to assess the 
colony color phenotype. In rare cases, strain isolates expressing the same 
construct can vary in colony color. We therefore recommend isolating a 
number of different colonies and using the isolates with the predominant 
colony colors for subsequent experiments. 

At this stage, some strains might already show switching between a red 
and a white colony color on YPD plates, indicating that the PrD under 
investigation has prion properties. The strains can then be plated on SD 
medium lacking adenine to more thoroughly select for the prion state.
However, switching rates of prions can be as low as 10^-6 to 10^-7. Therefore,
in cases where spontaneous switching is not observed, conformational 
conversion to the prion state should be induced. Amyloid nucleates in a 
concentration-dependent manner. Thus, transient overexpression of the 
PrD can be used to increase the switching frequency of a prion. To do 
this, we usually introduce an additional plasmid for expression of a
PrD-EYFP fusion under the control of a galactose-inducible promoter.
Induction of the prion state is achieved by growing the resulting transformants
with the GAL1 plasmid in galactose-containing medium for 24 h. The cells
are then plated on YPD and SD-ade plates at a density of 200 and 50,000 per
plate, respectively. The same strains grown in raffinose serve as a control.
A greater number of Ade+ colonies under inducing conditions suggests that
expression of the PrD-EYFP fusion induced a prion switch. In these cases, the
Ade+ colonies should be streaked on YPD plates to determine if a colony color
change from red to white or pink has occurred. Often several colony color
variants can be observed in one particular PrD-SUP35C strain. This variation
could be due to the presence of weak and strong prion variants, or
"strains," that have been reported previously for other prions (Tessier and
Lindquist, 2009). A single prion protein can generate multiple variants that
differ in the strength of their prion phenotypes. The underlying basis for this
phenomenon is the presence of initial structural differences in the
amyloid-forming nucleus that are amplified and maintained through the faithful
self-templating mechanism of amyloids.
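
The comparison between galactose-induced and raffinose-grown cultures
described above can be put on a quantitative footing by converting colony
counts into Ade+ frequencies. The following is a minimal sketch, assuming
that the YPD plate reports the plating efficiency and that the same
efficiency applies on SD-ade. The function names are illustrative and the
colony counts in the example are invented to show the arithmetic; they are
not experimental data.

```python
def ade_plus_frequency(ypd_colonies, sd_ade_colonies,
                       ypd_cells_plated=200, sd_ade_cells_plated=50_000):
    """Estimate the fraction of viable cells that form Ade+ colonies."""
    if ypd_colonies <= 0:
        raise ValueError("need at least one colony on the YPD plate")
    # Fraction of plated cells that gave rise to a colony on rich medium
    plating_efficiency = ypd_colonies / ypd_cells_plated
    # Number of viable cells expected on the selective plate
    viable_on_selective = sd_ade_cells_plated * plating_efficiency
    return sd_ade_colonies / viable_on_selective


def fold_induction(induced_frequency, control_frequency):
    """Ratio of Ade+ frequencies (galactose over raffinose)."""
    if control_frequency == 0:
        return float("inf")  # no background colonies on the control plate
    return induced_frequency / control_frequency


# Hypothetical counts, for illustration only
galactose = ade_plus_frequency(ypd_colonies=180, sd_ade_colonies=450)
raffinose = ade_plus_frequency(ypd_colonies=190, sd_ade_colonies=5)
print(galactose, raffinose, fold_induction(galactose, raffinose))
```

A high fold induction is only suggestive, because Ade+ colonies can also
arise from changes that are unrelated to the PrD; the colony color and
curing tests remain necessary.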

After having isolated several putative prion strains that exhibit a change 
in colony color, it is important to determine whether these changes are 
based on a conformational conversion of the PrD-Sup35C protein. Known 
yeast prions critically depend on the chaperone disaggregase Hsp104p for
propagation (Ross et al., 2005; Shorter and Lindquist, 2005). Thus, deletion
of the HSP104 gene or repeated streaking of the putative prion strains on
YPD plates containing 5 mM of the Hsp104p inhibitor guanidinium
hydrochloride (GdnHCl) are convenient ways of testing whether a color
change is due to a prion switch. However, as some prion variants can
propagate in the absence of Hsp104p, we suggest testing those that are
resistant to Hsp104p inactivation for the presence of aggregated
PrD-Sup35C by SDD-AGE and immunoblotting with a Sup35p-specific
antibody. We found that many strains are able to switch to a prion-like state
that is not based on a conformational change in the PrD-Sup35 protein, but
involves other genetic or epigenetic changes of unknown origin. To rule out
these false positive candidates, it is very important to rigorously test putative 
prion strains for the presence of conformationally altered PrDs. 

2.2.2. Ure2p-based prion assay 

Ure2p is a 354-amino acid protein consisting of an N-terminal PrD and a 
globular C-terminal region. The C-terminal region of Ure2p shows struc- 
tural similarity to glutathione transferases and is necessary and sufficient for 
its regulatory function. Ure2p regulates nitrogen catabolism through its 
interaction with the transcriptional activator Gln3p. [URE3], the prion 
state of Ure2p, results in the constitutive activation of Gln3p, and the 
prion-containing cells acquire the ability to utilize poor nitrogen sources 
in the presence of a rich nitrogen source. One of the genes activated by 
Gln3p is the DAL5 gene. It codes for a permease that is able to transport 
ureidosuccinate (USA), an essential intermediate of uracil biosynthesis. This 
ability to take up USA has historically been used to monitor the presence of 
[URE3] (Wickner, 1994). 

The N-terminal region of Ure2p is required for its prion properties 
in vivo and deletion of the N-terminal region has no detectable effect on the 
stability or folding of the C-terminal functional part of the protein. Analo- 
gous to the Sup35p assay described in the previous section, the Ure2p PrD 
can be replaced with a candidate PrD, and the resulting chimera can then be 
tested for its ability to create heritable phenotypes that mimic [URE3] 
(Nemecek et al., 2009). Assaying [URE3] by selection on USA-containing
plates has several disadvantages and for this reason we use a strain that
contains an ADE2 reporter gene that is placed under the control of the
DAL5 promoter (Brachmann et al., 2006). In prion-free [ure-o] cells,
ADE2 gene expression is repressed and as no functional Ade2p protein is
produced, colonies are red and fail to grow on medium lacking adenine.
Derepression of the DAL5 promoter in [URE3] cells, however, results in a
switch to a white colony color and the ability to grow on adenine-free
medium (Fig. 30.3A).

Figure 30.3 Using Ure2p to detect phenotype switching. (A) Schematic
representation of a prion detection assay based on Ure2p. See text for further details. (B) A
chimera between the PrD of Sup35p and the C-terminal region of Ure2p shows
switching behavior in [RNQ+] cells. (C) SDD-AGE of different colony color isolates
from a Sup35PrD-Ure2C strain. (D) A chimera between the Rnq1p PrD and Ure2C
displays colony color switching in [RNQ+] cells. (E) SDD-AGE of different colony
color isolates from the plates shown in (D).

We have also developed a yeast strain in which the chromosomal copy of
the URE2 gene was deleted in the BY334 background (Brachmann et al.,
2006). This strain was fully complemented by transformation of an
expression plasmid for Ure2p. In order to use this strain for the detection of
novel prions, we generated Gateway vectors for the formation of chimeras
between a candidate PrD and Ure2C (amino acids 66-354). The C terminus
of Ure2p contains an HA tag that allows the detection of PrD-Ure2C
chimeras with HA-specific antibodies. In addition, these vectors are avail- 
able with three different promoters that can be used to express PrD-Ure2C 
chimeras at different levels (TEF2 < ADH1 < GPD). Using these vectors, 
we tested the PrDs of two well-characterized yeast prions, Sup35p and
Rnq1p, for their ability to undergo prion switching when fused to the
C-terminal domain of Ure2p. A chimera between the PrD of
Sup35p and the C-terminal domain of Ure2p showed prion switching
and formed weak and strong prion strains (Fig. 30.3B and C). A fusion
between the PrD of Rnq1p and Ure2C also behaved as a prion, with at least
two different color variants (Fig. 30.3D and E). Additionally, we found that
even the full-length Rnq1p protein, when fused to Ure2C, resulted in a
fully functional chimera that showed prion-dependent inactivation (data 
not shown). This finding suggests that the C domain of Ure2p is much more 
tolerant of larger PrDs than the C domain of Sup35p. We tested additional 
previously described PrDs for their ability to induce switching when fused 
to Ure2C and we found that many of these showed prion switching 
behavior (data not shown). 

Despite its usefulness as a tool for the detection of prion switching 
behavior, there are disadvantages associated with the Ure2C-based selection 
system. The selection for the prion state of a PrD-Ure2C chimera on 
adenine-deficient media is usually very difficult due to high levels of 
background growth. Therefore, in those cases in which a prion state cannot 
be isolated by selection, we suggest identifying prion-containing strains 
based on colony color changes on media containing adenine. We noticed 
that the PrD-Ure2C fusions readily enter an aggregated state, which is 
probably due to the fact that the C domain of Ure2p does not have a 
solubilizing effect as strong as the C domain of Sup35p. In many cases, the 
high switching rates of PrD-Ure2C fusions allowed us to readily isolate 
several prion-containing strains from a single plate. Again, it is important to 
establish that the putative prion phenotypes are based on a conformationally 
altered state of the PrD-Ure2C chimera. The methods that are most 
convenient for testing are repeated streaking on plates containing GdnHCl 
or SDD-AGE followed by immunoblotting with an HA-specific antibody.

2.2.3. Prion selection assays 

Although yeast prions described to date cause a loss-of-function when in the 
prion state, prion switches could also induce a gain-of-function phenotype, 
as has been described for the CPEB protein that is involved in long-term 
memory formation (Si et al., 2003). To rigorously establish that a candidate
protein operates as a prion in a physiologically relevant manner, robust
prion selection assays have to be applied to isolate a prion-containing strain.
The functional annotation of the yeast genome is a tremendously powerful
resource for unraveling the biology of putative prions. A wealth of data
from genome-wide deletion and overexpression screens as well as chemical
and phenotypic profiling studies is now available to design functional prion
assays (Cooper et al., 2006; Hillenmeyer et al., 2008; Sopko et al., 2006; Zhu
et al., 2003).

The transcriptional activity of a putative prionogenic transcription fac- 
tor, for example, can be monitored in a transcriptional reporter assay. We 
recently used such an approach to verify the prion properties of Mot3p 
(Alberti et al., 2009). Mot3p is a globally acting transcription factor that
modulates a variety of processes, including mating, carbon metabolism, and 
stress response. It tightly represses anaerobic genes, including ANB1 and 
DAN1, during aerobic growth. To analyze Mot3p transcriptional activity, 
we created Mot3p-controlled auxotrophies by replacing the ANB1 or 
DAN1 ORFs with URA3. The resulting strains could not grow without 
supplemental uracil due to the Mot3p-mediated repression of URA3 
expression. However, URA3 expression and uracil-free growth could be 
restored upon reduction of Mot3p activity by deletion of MOT3 or by 
inactivation via prion formation. 

To select for the Mot3p prion state, we transiently overexpressed the 
Mot3p PrD, via a galactose-inducible expression plasmid, and plated the 
cells onto media lacking uracil. We isolated Ura + strains whose phenotype 
persisted even after the inducing plasmid had been lost. These putative 
prion strains were then analyzed using a variety of prion tests, including 
curing by inactivation of Hsp104, testing for non-Mendelian inheritance in
mating and meiotic segregation experiments and probing for an aggregated
form of Mot3p in prion-containing cells. Candidate-tailored prion selection
assays analogous to the one developed for Mot3p allow the study of prions in
their natural context, thus providing valuable insights into the biological
functions of prion conformational switches. 



2.2.4. Methods to analyze prion inheritance 

Yeast prions are protein-based epigenetic elements that are inherited in a
non-Mendelian manner. The unusual genetic properties of yeast prions can
be used to determine whether a phenotype is based on a prion (Ross et al.,
2005; Shorter and Lindquist, 2005). Diploid cells that result from a mating
between [prion-] and [PRION+] cells are usually uniformly [PRION+].
The [PRION+] phenotype emerges from the self-perpetuating nature of
prions: when a prion-containing cell fuses with a prion-free cell, the
efficient prion replication mechanism rapidly consumes nonprion conformers
until a new equilibrium is reached that is shifted in favor of the prion
conformer. Diploid [PRION+] cells that undergo meiosis and sporulation
normally generate [PRION+] progeny with a 4:0 inheritance pattern.
However, stochastic deviations from the 4:0 pattern are possible if the
number of prion-replicating units, or propagons, is low, causing some progeny
to receive no propagons and correspondingly to lose the prion state.
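
The effect of a low propagon number on the 4:0 pattern can be illustrated with a small calculation. The sketch below is only a toy model: it assumes that each propagon is assigned independently and with equal probability to one of the four spores, which is a simplification not stated in the text, and the function names are hypothetical.

```python
from math import comb


def prob_spore_without_propagon(n_propagons: int, n_spores: int = 4) -> float:
    """Chance that one given spore inherits no propagon at all.

    Toy assumption: every propagon lands in one of the spores
    independently and with equal probability.
    """
    if n_propagons < 0:
        raise ValueError("n_propagons must be non-negative")
    return (1 - 1 / n_spores) ** n_propagons


def prob_full_tetrad(n_propagons: int, n_spores: int = 4) -> float:
    """Chance that every spore inherits at least one propagon (a 4:0 tetrad).

    Uses inclusion-exclusion over the set of spores left empty.
    """
    return sum(
        (-1) ** k * comb(n_spores, k) * ((n_spores - k) / n_spores) ** n_propagons
        for k in range(n_spores + 1)
    )


if __name__ == "__main__":
    for n in (4, 10, 50, 200):
        print(n, round(prob_full_tetrad(n), 4))
```

Under this model the chance of a complete 4:0 tetrad rises quickly with propagon number, so deviations are expected mainly for weakly propagating prion variants.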

To establish whether a putative prion trait is inherited in a "protein-only"
manner through the cytoplasm, cytoduction can be performed. Cytoduction
is an abortive mating in which the cytoplasms of a prion-free recipient cell
and a prion-containing donor cell mix without
fusion of the nuclei. Daughter cells with mixed cytoplasm but only one
nucleus bud off from the zygote and can be selected (Conde and Fink,
1976). As a result, donor cells only contribute cytoplasm, whereas recipient
cells contribute cytoplasm and nucleus to the progeny. The recipient cells
we use are karyogamy-deficient and carry a mitochondrial petite mutation
termed rho0. As a consequence, recipient cells cannot fuse their nuclei with
those of the donor cell and are unable to grow on a nonfermentable carbon
source, such as glycerol, unless they receive wild-type mitochondria from
the donor cytoplasm. Following cytoduction, haploid progeny are selected
that retain the nuclear markers of the recipient strain but can also grow in
medium containing glycerol. If the aggregated state of the candidate PrD
was successfully transmitted through the cytoplasm, the selected cytoductants
should display the prion phenotype under investigation. For a more
detailed description of cytoduction we direct the reader to two recent
articles on this topic (Liebman et al., 2006; Wickner et al., 2006).

2.2.5. Transformation of prion particles 

The principal tenet of the prion hypothesis is that prions replicate in a 
protein-only manner, without the direction of an underlying nucleic acid 
template. The most rigorous proof for a prion, then, is to show that nucleic 
acid-free preparations of aggregated protein are "infectious" in and of 
themselves. That is, they have the capacity to convert cells to a stable 
prion state when they are introduced into those cells. Protein transforma- 
tions have irrefutably established the protein-only nature of prion propaga- 
tion for a number of yeast prions, and have also been used to show that the 
prion strain phenomenon results from conformational variations in the 
underlying amyloid structure. 

Protein transformations are generally done by fusing prion particles with 
recipient cell spheroplasts using polyethylene glycol (see Fig. 30.4). A
selectable plasmid is typically cotransformed with the prion particles to 
allow for the determination of total transformation efficiency, as prion 
protein preparations have variable infectivities. The transformed sphero- 
plasts are allowed to recover in agar and then analyzed for the prion 
phenotype. The putative prion particles to be used for protein transformations
can be obtained either from [PRION+] cells or from recombinant
protein that has been allowed to aggregate in vitro. 

Figure 30.4 Protein transformations to examine transmissibility of protein aggregates
(adapted from Tanaka and Weissman, 2006). In vitro-generated amyloid fibers, or
alternatively, partially purified yeast extracts, can be used to transform cells to a stable
prion state. The rigid cell wall of recipient cells is removed to generate competent
spheroplasts. The spheroplasts are then incubated with a transformation mix containing
prion particles and a selectable plasmid, followed by recovery of transformants on
isotonic medium that is selective for the plasmid. These transformants are then screened
for the prion state using phenotypic or biochemical assays.

When using crude extracts as a source of prions, care must be taken to
eliminate all viable nonlysed cells remaining in the extract (e.g., by
centrifugation or filtration), as they may otherwise appear as false positives.
For this reason, we recommend performing control transformations with- 
out recipient cells to verify that there are no contaminating cells in the 
extract. Extracts from [PRION+] yeast can be generated either by spheroplasting
(e.g., Section 2.1.3) or by glass bead lysis (Brachmann et al., 2005;
King et al., 2006). Cleared lysates resulting from either of these procedures
may be adequate for transformations without further manipulations in some 
cases. However, there are also a number of techniques that can be used to 
enrich prion particles relative to other cellular components and thereby 
improve transformation efficiencies. We direct the reader to the correspond- 
ing references concerning these procedures, which include: partial purifica- 
tion of aggregated protein by sedimentation (Tanaka and Weissman, 2006), 
sedimentation followed by affinity purification of the prion protein (King 
et al., 2006), and amplification of prion particles in cell extracts by seeding
the conversion of exogenously added recombinant prion protein
(Brachmann et al., 2005; King et al., 2006).

The most rigorous proof that a protein is a prion is to transform cells to 
the [PRION+] state using solely recombinant protein from a heterologous
host that has been converted to the putative prion form in vitro. This
procedure avoids potential confounding factors that are present in yeast
extracts, and also allows one to easily generate highly concentrated infectious
preparations without the need for labor-intensive enrichment of prion
particles from [PRION+] yeast cells.

We routinely purify yeast PrDs from Escherichia coli and convert them to
amyloid fibers in a near-physiological buffer (Alberti et al., 2009). To form
infectious amyloids in vitro, denatured proteins are diluted from a GdnHCl
stock to a final concentration of 10 µM in 1 ml assembly buffer (5 mM
K2HPO4, pH 6.6, 150 mM NaCl, 5 mM EDTA, 2 mM TCEP) and rotated
end-over-end for at least 24 h at room temperature. The formation of
amyloid is most easily monitored by the fluorescence of ThT (450 nm
excitation, 482 nm emission), which is added at 20-fold molar excess to
aliquots taken from the assembly reaction. Following amyloid conversion,
the reaction is centrifuged at maximum speed (20,000 rcf) in a tabletop
centrifuge for 30 min at room temperature, and the pellet of aggregated
protein is resuspended in 200 µl PBS. The protein is then sonicated with a
tip sonicator at the lowest setting for 10 s. Sonication shears amyloid fibers
into smaller pieces, thereby greatly enhancing their infectivity.
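
The volumes in this assembly protocol follow from simple dilution arithmetic (C1 x V1 = C2 x V2). The sketch below works through it; the 500 µM concentration of the denatured protein stock is a hypothetical example value, not a figure from the protocol, and the variable names are illustrative only.

```python
def dilution_volume_ul(stock_conc: float, final_conc: float, final_volume_ul: float) -> float:
    """Volume of stock needed so that final_volume_ul has final_conc (same units)."""
    if stock_conc <= 0 or stock_conc < final_conc:
        raise ValueError("stock must be more concentrated than the final solution")
    return final_conc * final_volume_ul / stock_conc


# Values taken from the protocol text
FINAL_PROTEIN_UM = 10.0      # protein concentration in the assembly reaction
ASSEMBLY_VOLUME_UL = 1000.0  # 1 ml assembly buffer
THT_MOLAR_EXCESS = 20        # ThT is added at 20-fold molar excess
PBS_VOLUME_UL = 200.0        # pellet is resuspended in 200 µl PBS

# Hypothetical value: concentration of the denatured protein stock in GdnHCl
stock_um = 500.0

stock_volume_ul = dilution_volume_ul(stock_um, FINAL_PROTEIN_UM, ASSEMBLY_VOLUME_UL)
tht_um = FINAL_PROTEIN_UM * THT_MOLAR_EXCESS
# Upper bound after pelleting, reached only if all protein converted and was recovered
max_resuspended_um = FINAL_PROTEIN_UM * ASSEMBLY_VOLUME_UL / PBS_VOLUME_UL

print(f"protein stock to add: {stock_volume_ul:.1f} µl")
print(f"ThT for a 10 µM aliquot: {tht_um:.0f} µM")
print(f"fiber concentration after resuspension: at most {max_resuspended_um:.0f} µM monomer equivalents")
```

The last figure is an upper limit; incomplete conversion or losses during centrifugation will lower it.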

Proper negative controls are essential for interpreting protein transfor- 
mations. Prions arise spontaneously at a low frequency and this frequency 
increases after cells are exposed to stress (Tyedmers et al., 2008). Consequently,
the efficiency of transformation to [PRION+] must be normalized
against mock transformations, such as freshly diluted (soluble) prion protein,
amyloid fibers of other prions, or [prion-] cell extract.
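
One simple way to express such a comparison is to score [PRION+] colonies as a fraction of all plasmid transformants and then compare the test sample with the mock control. The following sketch is an illustration only: the colony counts are invented, the function names are hypothetical, and subtracting the mock frequency is just one reasonable convention.

```python
def conversion_frequency(prion_colonies: int, plasmid_transformants: int) -> float:
    """Fraction of plasmid transformants that score as [PRION+]."""
    if plasmid_transformants <= 0:
        raise ValueError("need at least one plasmid transformant")
    if not 0 <= prion_colonies <= plasmid_transformants:
        raise ValueError("prion colonies must lie between 0 and the number of transformants")
    return prion_colonies / plasmid_transformants


def excess_over_mock(sample_frequency: float, mock_frequency: float) -> float:
    """Conversion attributable to the transformed protein, floored at zero."""
    return max(sample_frequency - mock_frequency, 0.0)


# Invented example counts: (prion colonies, plasmid transformants)
fiber_frequency = conversion_frequency(45, 300)  # sonicated amyloid fibers
mock_frequency = conversion_frequency(3, 300)    # freshly diluted soluble protein

print(f"fibers: {fiber_frequency:.1%}, mock: {mock_frequency:.1%}")
print(f"excess over mock: {excess_over_mock(fiber_frequency, mock_frequency):.1%}")
```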

Recipient cells are prepared for protein transformations by a gentle
enzymatic removal of the cell wall (spheroplasting), using a protocol
adapted from Tanaka and Weissman (2006). Many prions have an increased
rate of appearance in yeast cells harboring the [PIN+] prion. For this reason,
we recommend using [pin-] yeast to reduce background from the spontaneous
appearance of the prion state of interest. Yeast are grown to an OD of
0.5 in YPD, harvested by centrifugation, and washed twice in sterile
distilled water. The cells are then washed with 1 ml SCE (1 M sorbitol,
10 mM EDTA, 10 mM DTT (added just before use), 100 mM sodium
citrate, pH 5.8) and then resuspended in 1 ml SCE. Sixty microliters of lyticase
solution (4.2 mg/ml lyticase (Sigma); 50 mM sodium citrate, pH 5.8) is
added and the cells are incubated for 20-30 min at 30 °C while shaking at
300 rpm. It is very important that this step not be allowed to proceed for too
long, or cells will lose viability. We recommend standardizing this step using
identical aliquots of lyticase solution prepared from a single lot, which are
stored at -80 °C. The progress of spheroplasting can be monitored during
this incubation by placing 2 µl of cells in 20 µl of 1% SDS and observing
them under a microscope. Spheroplasts in SDS should lyse and be
invisible or appear as ghost cells.

The spheroplasts are harvested at 500 rcf for 3 min at room temperature,
followed by washing twice with 1 ml of STC buffer (1 M sorbitol, 10 mM
CaCl2, 10 mM Tris-HCl, pH 7.5). Finally, the spheroplasts are resuspended
in 0.5 ml of STC buffer. Spheroplasts are sensitive to shear forces and
consequently must be handled gently during all manipulations. To resuspend
spheroplasts, we use a 1-ml plastic pipette tip that has ~1 cm of the tip
removed.

We add 100 µl of spheroplasts to 4 µl of 10 mg/ml salmon sperm DNA,
25 µl of 0.1 mg/ml selectable plasmid (e.g., pRS316 for a URA3-marked
plasmid) and 33 µl of the protein solution to be transformed. The final
protein concentration of amyloid fibers should be ~10 µM, or if using yeast
extract, 200-400 µg/ml total protein. The samples are tapped gently to mix
and incubated for 30 min at room temperature. Next, proteins are fused to
spheroplasts by adding 1.35 ml PEG buffer (20% (w/v) PEG 8000, 10 mM
CaCl2, 10 mM Tris-HCl, pH 7.5) and incubating for 30 min at room
temperature. Note that the optimal concentration and molecular weight of
PEG used in this step may vary depending on the transformed protein (Patel
and Liebman, 2007). Spheroplasts are collected at 500 rcf for 3 min at room
temperature, and resuspended in 0.5 ml of SOS buffer (1 M sorbitol, 7 mM
CaCl2, 0.25% yeast extract, 0.5% bactopeptone), followed by incubation for
1 h at 30 °C with shaking at 300 rpm. Meanwhile, 8 ml aliquots of spheroplast
recovery medium are prepared in 15 ml tubes and maintained in a 48 °C water
bath. The spheroplast recovery medium needs to be selective for the plasmid
(e.g., SD-ura) and for the prion state if desired (see below), and is supplemented
with 1 M sorbitol and 2.5% agar. Each transformation reaction is
diluted into one aliquot of medium, mixed by gentle inversion, and overlaid
immediately onto the appropriate selective plates that have been prewarmed
to 37 °C.
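
The component volumes given above fix how concentrated the protein preparation must be. The sketch below works through the arithmetic; it assumes that the stated final protein concentration refers to the mix before the PEG buffer is added, which the text does not say explicitly, and the variable names are illustrative.

```python
# Volumes (µl) of the transformation mix, as given in the protocol
MIX_UL = {
    "spheroplasts": 100.0,
    "salmon sperm DNA (10 mg/ml)": 4.0,
    "selectable plasmid (0.1 mg/ml)": 25.0,
    "protein solution": 33.0,
}
PEG_BUFFER_UL = 1350.0
PEG_STOCK_PERCENT = 20.0

total_mix_ul = sum(MIX_UL.values())
dilution_factor = total_mix_ul / MIX_UL["protein solution"]


def required_stock(final_conc: float) -> float:
    """Concentration the protein solution needs for a given final concentration.

    Assumption: 'final' means in the mix before PEG addition.
    """
    return final_conc * dilution_factor


fiber_stock_um = required_stock(10.0)            # for ~10 µM amyloid fibers
extract_stock_low = required_stock(200.0)        # µg/ml, lower bound for extract
extract_stock_high = required_stock(400.0)       # µg/ml, upper bound for extract
peg_final_percent = PEG_STOCK_PERCENT * PEG_BUFFER_UL / (PEG_BUFFER_UL + total_mix_ul)

print(f"total mix volume: {total_mix_ul:.0f} µl")
print(f"fiber stock needed: {fiber_stock_um:.1f} µM")
print(f"extract stock needed: {extract_stock_low:.0f}-{extract_stock_high:.0f} µg/ml")
print(f"PEG 8000 during fusion: {peg_final_percent:.1f}% (w/v)")
```

Under this reading, a fiber preparation of roughly 50 µM is needed to reach ~10 µM in the mix.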

Plates are incubated at 30 °C under high humidity until colonies develop
(up to 1 week). Colonies can then be picked out of the agar and scored for
[PRION+] phenotypes. If transformation efficiencies are low, it is especially
important that putative [PRION+] transformants are verified by
secondary assays like SDD-AGE, to distinguish them from genetic revertants
or other background colonies.

We and others (Brachmann et al., 2005) have found that selecting directly
for the [PRION+] state of some prions (e.g., [MOT3+] and [URE3]) during
spheroplast recovery increases conversion to the [PRION+] state relative to
delaying selection until after the cells have recovered. Newly induced prion
states are often initially unstable and seem to be lost at a high frequency under
nonselective conditions. Consequently, applying an immediate mild selective
pressure during the spheroplast recovery step can improve the apparent
transformation efficiency by preventing many prion-containing spheroplasts
from losing the prion state during colony formation in the sorbitol-containing
media. However, stringent selective conditions can also inhibit spheroplast
recovery, resulting in drastically reduced transformation efficiencies. For
instance, [URE3] spheroplasts recover at a low frequency when plated directly
to USA-containing media (Brachmann et al., 2005), and we have observed
that [PSI+] spheroplasts generally recover poorly in adenine-deficient media.
For each prion and selection scheme, there is likely to be an optimum
window of selection stringency that maximizes the number of [PRION+]
transformants recovered.




3. Concluding Remarks 

Aggregation has been suggested to be a generic property of proteins 
(Chiti and Dobson, 2006), but most proteins aggregate only under condi- 
tions that fall outside of the normal physiological range. Studies of proteins 
that aggregate under nonphysiological experimental conditions have 
provided important insights into general aspects of protein aggregation. 
Yet proteins that aggregate under physiological conditions are much more 
interesting from a biological point of view. Recent studies show that 
misfolding and aggregation propensities are likely to be a dominant force 
in the evolution of protein sequences (Chen and Dokholyan, 2008; 
Drummond and Wilke, 2008). This hypothesis is further underscored by 
the presence of complex quality control mechanisms that govern the abun- 
dance and structure of protein aggregates. Studies that identify large num- 
bers of aggregation-prone proteins under physiological conditions will be 
necessary to understand how aggregation propensities shape the sequence 
and structure of proteins and the composition of proteomes. These studies 
will also allow us to generate comprehensive inventories of proteins that are 
capable of forming functional or pathological aggregates. Such inventories 
will be an important asset, as their analysis will facilitate the identification of 
sequence determinants that drive aggregation behavior. A growing toolbox 
of scalable and adaptable protein aggregation assays now enables rapid 
identification and characterization of the repertoire of aggregation-prone 
proteins in yeast. Moreover, they place yeast at the vanguard of new 
technological developments that have a tremendous impact on our under- 
standing of fundamental aspects of biology. 
Abstract

Candida albicans is an opportunistic fungal pathogen of humans. Although a 
normal part of our gastrointestinal flora, C. albicans has the ability to colonize 
nearly every human tissue and organ, causing serious, invasive infections. 
In this chapter we describe current methodologies used in molecular genetic 
studies of this organism. These techniques include rapid sequential gene 
disruption, DNA transformation, RNA isolation, epitope tagging, and chromatin 
immunoprecipitation. The ease of these techniques, combined with the high-
quality C. albicans genome sequences now available, has greatly facilitated
research into this important pathogen. 

Candida albicans is a normal resident of the human gastrointestinal tract; it is
also the most common fungal pathogen of humans, causing both mucosal and
systemic infections, particularly in immunocompromised patients. C. albicans
and Saccharomyces cerevisiae last shared a common ancestor more than
900 million years ago; in terms of conserved coding sequences, the two species
are approximately as divergent as fish and humans. Although C. albicans and
S. cerevisiae share certain core features, they also exhibit many significant
differences. This is not surprising as C. albicans has the ability to survive in
nearly every niche of a mammalian host, a property not shared by S. cerevisiae.
Research into C. albicans is important in its own right, particularly with regard
to its ability to cause disease in humans; in addition, comparison with
S. cerevisiae can reveal important insights into evolutionary processes.

Many of the methodologies developed for use in S. cerevisiae have been
adapted for C. albicans, and we describe some of the most common. Although 
alternative procedures are described in the literature, we have found those 
described below to be the most convenient. Because the C. albicans parasexual 
cycle is cumbersome to use in the laboratory, genetics in this organism has 
been based almost entirely on directed mutations. Because the organism is 
diploid, creating a deletion mutant requires two rounds of gene disruption. We 
describe a rapid method for creating sequential disruptions, one which can be 
scaled up to create large collections of C. albicans deletion mutants. We also 
describe a series of additional techniques including DNA transformation, mRNA 
isolation, epitope tagging, and chromatin immunoprecipitation (ChIP). The ease 
of these techniques, combined with the high-quality C. albicans genome 
sequences now available, has greatly increased the quality and pace of 
research into this important pathogen. 




1. Homozygous Gene Disruption in C. albicans 

Creating gene knockout mutants in C. albicans typically involves two
rounds of transformation (to disrupt both alleles of a given gene) with a
linear fragment of DNA bearing a selectable marker as well as sequences
identical (or nearly identical) to those sequences flanking the target gene
(Fig. 31.1A). Approximately 60 nucleotides of flanking sequence on each
side of the selectable marker approaches the minimum necessary for
successful targeting, and the efficiency appears to improve with increasing
lengths. The disruption cassette can be created by PCR or by traditional
cloning, and the available selectable markers include multiple auxotrophic
markers (such as HIS1, LEU2, ARG4, and URA3) and an antibiotic
resistance gene (SAT1); note that the URA3 marker should be used with
care, because important C. albicans phenotypes such as morphogenesis and
virulence are strongly dependent on the levels of URA3 expression. Transformants
are selected on appropriate media and then screened for integration
of the disruption cassette at the correct genomic locus. Following
disruption of the second allele, verification that the target ORF is truly
deleted (achieved by PCR or Southern blotting) is crucial, as extra copies of
chromosomes can arise during the transformation procedures.

Figure 31.1 (A) Homozygous gene disruption by two rounds of transformation and
homologous recombination. (B) Fusion PCR method.

Following is a streamlined protocol based on fusion PCR that results in a
high efficiency of gene disruption (Noble and Johnson, 2005; Fig. 31.1B).
Maximal efficiency is achieved by the use of auxotrophic markers from non-
albicans Candida species or bacterial antibiotic resistance genes as markers; these
strategies reduce integration events at "off-target" locations in the C. albicans
genome. The first round of PCR consists of three reactions: two to amplify
DNA upstream and downstream of the target gene, and a third to amplify the
selectable marker. Importantly, certain primers (indicated in Fig. 31.1 and
described in detail in the protocol) contain complementary tails. In the second
round of PCR, the three products of the first round of PCR serve as an
aggregate template, resulting in a single product. Note that specific reagents
and kits are recommended, but alternatives can be substituted. We have found
that Ex Taq (Takara) and Klentaq LA (DNA Polymerase Technology) yield
better results than other commercial enzymes for fusion PCR.
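
The complementary tails are what allow the three first-round products to prime one another in the second round. As a minimal sketch, the 20-nt tails listed for primers 2 to 5 in Section 1.1 can be checked for reverse complementarity, and a tailed primer can be assembled from a tail plus a target-specific sequence. The helper names are hypothetical and are not part of any published tool.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


# 20-nt tails from the primer list in Section 1.1
TAILS = {
    2: "CCGCTGCTAGGCGCGCCGTG",  # 5' end of marker, top strand
    3: "CACGGCGCGCCTAGCAGCGG",  # end of 5' flank, bottom strand
    4: "GTCAGCGGCCGCATCCCTGC",  # start of 3' flank, top strand
    5: "GCAGGGATGCGGCCGCTGAC",  # 3' end of marker, bottom strand
}


def tails_are_complementary(primer_a: int, primer_b: int) -> bool:
    """True if the two tails would anneal to each other in the fusion reaction."""
    return reverse_complement(TAILS[primer_a]) == TAILS[primer_b]


def tailed_primer(tail: str, specific_part: str) -> str:
    """Join a tail (5' side) to the template-specific part (3' side)."""
    return tail + specific_part


if __name__ == "__main__":
    print("primers 2/3 complementary:", tails_are_complementary(2, 3))
    print("primers 4/5 complementary:", tails_are_complementary(4, 5))
    # pSN-series primer 2: lowercase tail followed by the marker-specific part
    print(tailed_primer(TAILS[2].lower(), "ACCAGTGTGATGGATATCTGC"))
```

The same check is useful when designing custom tails for a different marker, since a single mismatch in a tail can prevent the fusion reaction.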

1.1. Homozygous gene disruption by fusion PCR 

The following auxotrophic markers are available as cloned genes from 
non-albicans Candida species (Noble and Johnson, 2005):

C. dubliniensis HIS1 = pSN52 

C. maltosa LEU2 = pSN40 

C. dubliniensis ARG4 = pSN69 

1. Design the following PCR primers: 

Gene disruption primers (Fig. 31.1B):

Primer 1: primer to gene of interest (beginning of 5' flank top, ~350 bp
upstream of ORF)

Primer 2*: CCGCTGCTAGGCGCGCCGTG-selectable marker (5' marker top)

Primer 3: CACGGCGCGCCTAGCAGCGG-gene of interest (end of 5' flank bottom)

Primer 4: GTCAGCGGCCGCATCCCTGC-gene of interest (beginning of 3' flank top)

Primer 5*: GCAGGGATGCGGCCGCTGAC-selectable marker (3' marker bottom)

Primer 6: primer to gene of interest (end of 3' flank bottom, ~350 bp
downstream of ORF)

* If using auxotrophic markers in the pSN series (Noble and Johnson, 2005),
sequences of primer 2 and primer 5 are:

Primer 2 ccgctgctaggcgcgccgtgACCAGTGTGATGGATATCTGC
Primer 5 gcagggatgcggccgctgacAGCTCGGATCCACTAGTAACG

Knockout verification primers (Fig. 31.2):

Target Check Left: just upstream of primer 1
Target Check Right: just downstream of primer 6
ORF Left: internal to the deleted ORF
ORF Right: internal to the deleted ORF
Marker Check Left*: toward end of selectable marker that is near primer 2
Marker Check Right*: toward end of selectable marker that is near primer 5

Figure 31.2 Primers for verification PCR.

*If using auxotrophic markers in the pSN series, one can use the
following Marker Check primers:

C. dubliniensis HIS1 Check Left ATTAGATACGTTGGTGGTTC
C. dubliniensis HIS1 Check Right AACACAACTGCACAATCTGG
C. maltosa LEU2 Check Left AGAATTCCCAACTTTGTCTG
C. maltosa LEU2 Check Right AAACTTTGAACCCGGCTGCG
C. dubliniensis ARG4 Check Left TTCAACCTTTCAAACGATGC
C. dubliniensis ARG4 Check Right TCGATACATTTGCGGTACAG

2. Set up reactions for PCR Round I on ice, and run the PCR:

Reaction 1 = primers 1 + 3 using genomic DNA as template
Reaction 2 = primers 4 + 6 using genomic DNA as template
Reaction 3 = primers 2 + 5 using auxotrophic marker as template,
for example, pSN52 = C. dubliniensis HIS1

PCR reaction (50 µl):

5 µl 10× Ex Taq buffer
4 µl 2.5 mM dNTPs
36 µl H2O
<1 µl genomic DNA (reactions 1 and 2) or <1 µl plasmid DNA (reaction 3)
2 µl 1st primer (5 µM)
2 µl 2nd primer (5 µM)
0.5 µl Ex Taq polymerase

PCR conditions:

94 °C 5 min
35 cycles: 94 °C 30 s, 45 °C 45 s, 72 °C 1 min (flanks) or 4 min (marker)
72 °C 10 min
10 °C hold

3. Run out 5 µl of each PCR product on a 1% agarose gel to confirm successful
synthesis.

4. Optional: Gel purify the product of Reaction 3 (marker fragment).
Run the PCR reaction on a 1% agarose gel and cut out the correctly
sized band under long-wave ultraviolet light. Recover DNA with a
Qiagen QIAquick gel extraction kit (eluting in 50 µl H2O or Buffer
EB). By isolating the correct PCR product and eliminating contaminants,
this step increases the efficiency of the fusion reaction for certain
targets and allows for stable storage of the marker fragment (at -20 °C).

5. Optional: Purify the products of Reactions 1 and 2 with a Qiagen
QIAquick PCR purification kit. Use as directed and elute in 50 µl H2O
or Buffer EB.

6. Set up reaction for PCR Round II on ice, and run PCR.

PCR reaction (100 µl):

10 µl 10× Ex Taq buffer
8 µl 2.5 mM dNTPs
0.75 µl Ex Taq polymerase
67 µl H2O
1.5 µl reaction 1 PCR product (5' flank)
1.5 µl reaction 2 PCR product (3' flank)
2 µl reaction 3 PCR product (marker)
4 µl primer 1 (5 µM)
4 µl primer 6 (5 µM)

Fusion PCR conditions:

94 °C 5 min (hot start: i.e., wait until the PCR block heats to ~80 °C
before introducing PCR reactions)
35 cycles: 94 °C 30 s, 50 °C 45 s, 72 °C 4.5 min
72 °C 10 min
10 °C hold

7. Run out 5 µl PCR product on a 1% agarose gel to confirm success of
fusion reaction.

The fusion product should be ~3-4 kb, depending on the auxotrophic
marker chosen and the length of flanking sequences. There is
typically a majority of full-sized product, with a variable amount of a
minority shorter product.

Note: If the fusion reaction is unsuccessful, a variation is to include
betaine in the reaction; that is, add 20 µl of 5 M betaine (final 1.3 M)
and just 47 µl H2O in the fusion reaction mix. If betaine is used,
decrease the PCR denaturation temperature to 92.5 °C.

8. Optional: Purify the fusion PCR product with a Qiagen QIAquick
PCR purification kit.

Use kit as directed and elute in 30 µl of H2O or Buffer EB.
9. Transform 10 µl of disruption fragment into fresh competent SN152
(or any strain that is auxotrophic for the selectable marker). Plate cells
on the appropriate dropout plate, for example, -His. Incubate plates at
30 °C for 2 days, or until individual colonies are visible.

Note: If selecting for nourseothricin resistance, allow cells to
recover by growing in YPD liquid media without selection for at
least 5 h at 30 °C prior to plating on selective media.

10. Patch ~10 transformants onto fresh medium, and perform colony PCR
to verify the correct 5' and 3' junctions of the disrupted allele (Fig. 31.2).
In separate PCR reactions, use the primer pairs Target Check Left + Marker
Check Left and Target Check Right + Marker Check Right. Correct integrants
should have PCR products of the expected size (~0.5 kb) with each primer set.

11. Pick at least two confirmed heterozygous knockout candidates and 
streak for single colonies on fresh medium. 

Note: Because unlinked mutations can be acquired during strain 
construction and because two rounds of transformation are required to 
create a homozygous deletion, it is wise to obtain at least 2 independent 
isolates of any C. albicans knockout. 

12. Repeat the fusion PCR step, using the same flank products (PCR 
reactions 1 and 2) but a different selectable marker (e.g., pSN40 = 
C. maltosa LEU2). 

13. Transform 10 µl of the new disruption fragment into two independent
heterozygous knockout strains and plate on doubly selective medium
(e.g., -His, -Leu).

14. Patch ~10 transformants of each strain onto fresh medium, and perform
colony PCR as above to confirm the appropriate 5' and 3'
junctions of the second disrupted allele.

15. For candidates with expected disruptions of both target alleles, perform 
a final PCR verification that there are no remaining copies of the target 
ORF. This step is necessary because aneuploidies or translocations 
commonly result in an extra copy of the target gene. Remember to 
test as a positive control a strain that retains a copy of the target gene 
(e.g., wild type or the heterozygous knockout). One should see a PCR 
product of the expected size in the positive control and no PCR 
product in the desired homozygous deletion strain. 
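
The verification logic of steps 10 to 15 can be summarized as a small decision function. This is only a sketch of how the colony PCR results might be recorded and interpreted; the class, field names, and result labels are hypothetical, and a real analysis should also take the positive and negative PCR controls into account.

```python
from dataclasses import dataclass


@dataclass
class ColonyPcrResult:
    """Presence (True) or absence (False) of each expected PCR product."""
    first_marker_5p: bool   # Target Check Left + Marker Check Left, first marker
    first_marker_3p: bool   # Target Check Right + Marker Check Right, first marker
    second_marker_5p: bool  # same junctions for the second marker
    second_marker_3p: bool
    orf_internal: bool      # ORF Left + ORF Right


def classify(result: ColonyPcrResult) -> str:
    """Interpret one candidate strain from its junction and ORF PCRs."""
    first_ok = result.first_marker_5p and result.first_marker_3p
    second_ok = result.second_marker_5p and result.second_marker_3p
    if first_ok and second_ok:
        if result.orf_internal:
            # Both alleles replaced, yet the ORF is still detected
            return "ORF retained: possible aneuploidy or translocation"
        return "homozygous deletion"
    if (first_ok or second_ok) and result.orf_internal:
        return "heterozygous knockout"
    return "inconclusive: repeat PCR or discard"


if __name__ == "__main__":
    candidate = ColonyPcrResult(True, True, True, True, False)
    print(classify(candidate))
```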




2. C. albicans DNA Transformation 

The following is a basic protocol for DNA transformation with 
C. albicans. Because stable extrachromosomal plasmids have yet to be fully 
developed for use in C. albicans, this protocol is typically used for transfor- 
mation and stable integration of linear DNA fragments into the C. albicans 
genome. For efficient homologous recombination to occur, a minimum of
60 bp of sequence identical (or nearly identical) to the genomic target locus 
is required on either end of the DNA fragment that is to be transformed. 

1. Inoculate a 5-ml culture in YEPD and grow overnight at 30 °C.

2. Dilute 300 µl of the overnight culture into 10 ml of fresh YEPD and
grow at 30 °C for 4-6 h (or until the OD is around 0.5-1.0).

3. Centrifuge for 2 min at ~1000×g and discard supernatant.

4. Resuspend in 900 µl LiOAc/TE and transfer to a microcentrifuge tube.

5. Pellet for 1 min at ~1000×g and discard supernatant.

6. Wash two more times with 900 µl LiOAc/TE, then resuspend in
~400 µl final volume with LiOAc/TE.

7. In a separate microfuge tube mix (in order) the following:

a. 10 µl 10 mg/ml denatured herring sperm (or salmon sperm) DNA
i. Prepare by boiling ~2 min, then snap cooling in ice water

b. >1 µg of DNA fragment to be transformed (or ~20-50 µl of PCR
product)
i. Highest transformation efficiencies are achieved if the DNA is
NOT purified following enzymatic reactions (i.e., PCR products
or plasmid digests)

c. 200 µl washed cells in LiOAc/TE

d. 1 ml PEG mix

8. Incubate overnight at room temperature.

9. Heat shock 1 h at 42 °C (or 44 °C for 15 min).

10. Pellet for 1 min at ~1000×g and discard supernatant.

11. Wash one time with 1 ml sterile water.

12. Resuspend in ~150 µl final volume with sterile water.

a. For selection of nourseothricin resistance, transfer cells to 5 ml
YEPD and recover by incubation at 30 °C for at least 5 h prior to
plating on selective media.

13. Plate on selective media and incubate at 30 °C for 2-3 days.

2.1. Transformation buffers 

LiOAc mix:

10 ml 1 M LiOAc
200 µl 0.5 M EDTA
1 ml 1 M Tris-HCl, pH 7.5
H2O to 100 ml
Filter sterilize

PEG mix:

80 ml 50% PEG-3350
10 ml 1 M LiOAc
200 µl 0.5 M EDTA
1 ml 1 M Tris-HCl, pH 7.5
H2O to 100 ml
Filter sterilize
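
For reference, the final concentrations implied by these recipes can be worked out from the stock volumes. The short sketch below does this arithmetic; the dictionary layout and names are illustrative only.

```python
FINAL_VOLUME_ML = 100.0

# (stock volume in ml, stock concentration); molar stocks unless noted
LIOAC_MIX = {
    "LiOAc": (10.0, 1.0),
    "EDTA": (0.2, 0.5),
    "Tris-HCl, pH 7.5": (1.0, 1.0),
}
PEG_MIX = dict(LIOAC_MIX)
PEG_MIX["PEG-3350 (%)"] = (80.0, 50.0)  # stock concentration in percent


def final_concentrations(recipe: dict) -> dict:
    """Final concentration of each component after bringing to 100 ml."""
    return {
        name: volume_ml * stock / FINAL_VOLUME_ML
        for name, (volume_ml, stock) in recipe.items()
    }


lioac_final = final_concentrations(LIOAC_MIX)
peg_final = final_concentrations(PEG_MIX)
water_for_peg_mix_ml = FINAL_VOLUME_ML - sum(v for v, _ in PEG_MIX.values())

if __name__ == "__main__":
    for name, conc in peg_final.items():
        unit = "%" if name.endswith("(%)") else " M"
        print(f"{name}: {conc:g}{unit}")
    print(f"water needed for PEG mix: {water_for_peg_mix_ml:.1f} ml")
```

Both mixes therefore contain 100 mM LiOAc, 1 mM EDTA, and 10 mM Tris-HCl, and the PEG mix contains 40% PEG-3350.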




3. C. albicans Total RNA Purification 

Purifying total cellular RNA from liquid cultures of C. albicans is comparable
in most regards to purifications from S. cerevisiae. As with S. cerevisiae,
lysing the C. albicans cell wall requires a more vigorous procedure than does 
lysis of animal cells. The procedure outlined below includes organic extrac- 
tions in Phase Lock tubes (Eppendorf) for removal of proteins and other 
cellular material. Because C. albicans cellular debris tends to disrupt the 
Phase Lock gel matrix, the first organic extraction is performed in conven- 
tional rather than Phase Lock tubes; if desired, subsequent extraction steps 
may also be performed in conventional tubes. Purified RNA is suitable for 
most applications, including microarray analysis, quantitative RT-PCR and 
Northern hybridization. 

1. Grow a 10-ml liquid culture of C. albicans cells to an appropriate
concentration (e.g., OD600 of 1.0-1.5). For other volumes and cell
densities, all steps may be scaled proportionally.

2. Collect cells by centrifugation (2000×g, 5 min, 4 °C) in a 15-ml polypropylene
conical tube. Remove as much liquid as possible, and freeze by
immersing tube in liquid nitrogen. Store frozen cell pellet at -80 °C.

3. Transfer frozen tube containing cell pellet to ice, working quickly to 
avoid thawing prior to the addition of phenol. 

4. To frozen pellet, first add 2 ml phenol, then 2 ml extraction buffer
(50 mM sodium acetate [from pH 5.3 stock], 10 mM EDTA, and 1%
SDS). The use of acidic (pH ~4.5) rather than neutral phenol will
reduce, but not eliminate, DNA contamination. While working with
phenol and chloroform, use appropriate protective equipment (goggles, gloves, lab
coat, fume hood) and dispose of hazardous waste appropriately.

5. Ensure that each tube is well-capped, then mix by vortexing. Transfer 
tube to 65 °C water bath. Incubate for 10 min, removing to vortex 
vigorously every minute or so. 

6. Transfer tube to ice for 5 min. Keep samples on ice for all subsequent 
steps, except where noted. 

7. Add 2 ml chloroform to tube, cap securely, then mix well by vortexing. 

8. Spin tube in tabletop centrifuge (2000×g, 5 min, 4 °C) to separate
phases, along with an empty 15 ml Heavy Phase Lock Gel tube
(Eppendorf) for use in the next step.



746 Aaron D. Hernday et al. 

9. Carefully remove aqueous (top) phase by pipetting, avoiding as much 
material at the interface as possible. Transfer to the prespun Phase Lock 
tube along with 2 ml phenol: chloroform. (Use an equal volume 
mixture of phenol and chloroform; this can be either neutral or acidic, 
with or without isoamyl alcohol.) Cap tube and shake by hand, but do 
not vortex, as this may disrupt the gel matrix. 

10. Spin tube in tabletop centrifuge (1500 × g, 5 min, 4 °C). Organic phase
should partition below the gel matrix.

11. Add 2 ml chloroform, then shake and spin as before. (This step removes
residual phenol from the sample.)

12. From this point on, ensure that all reagents and containers are free of
RNases. Pour aqueous phase into a fresh conical tube. Add 200 µl 3 M
sodium acetate (pH 5.3) and 2 ml isopropyl alcohol. Shake or vortex
briefly, then incubate at room temperature for 10 min.

13. Pellet RNA in tabletop centrifuge at maximum speed (20 min, 4 °C). 
A substantial white pellet of RNA should be visible. Pour off superna- 
tant and let drain briefly with tube inverted on a Kim wipe. 

14. Use 800 µl 70% ethanol to transfer pellet by pipet to a 1.5-ml
microfuge tube, breaking up pellet if necessary. Spin in 4 °C microcentrifuge
at maximum speed for 5 min. 

15. Carefully remove as much liquid as possible from pellet with pipet tip, 
then air dry for a few minutes. Do not allow RNA pellet to become too 
dry, as this will make resuspension difficult. 

16. Resuspend RNA in ~200 µl RNase-free water. Pipetting up and
down will help to resuspend the RNA; if necessary, the tube can also 
be incubated at 50 °C for 10 min. 

17. To determine the concentration of RNA in solution, measure its 
absorbance in an ultraviolet spectrophotometer. If using a NanoDrop
(Thermo Scientific), 2 µl of solution can be measured directly. Otherwise,
dilute 1:100 to measure. The concentration of the measured
solution (in ng/µl) is given by the absorbance at 260 nm multiplied
by 40; for a diluted sample, multiply by the dilution factor to obtain the
concentration of the stock. Expected yield is roughly 400 µg.

18. RNA can be stored at −80 °C, then thawed slowly on ice for use. For
downstream applications that may be compromised by contaminating 
DNA (such as quantitative RT-PCR), RNA should first be treated 
with an RNase-free DNase (e.g., RQ1 DNase from Promega). 
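
The arithmetic in step 17 can be sketched as follows. This is a minimal illustration, not part of the original protocol: the function names are hypothetical, and the dilution factor of 100 simply encodes the 1:100 dilution mentioned above.

```python
def rna_concentration_ng_per_ul(a260, dilution_factor=1.0):
    """Concentration of the undiluted RNA stock in ng/ul.

    Step 17 gives A260 x 40 as the concentration of the measured solution;
    multiplying by the dilution factor (assumed: 100 for a 1:100 dilution,
    1 for a direct NanoDrop reading) converts this to the stock concentration.
    """
    if a260 < 0 or dilution_factor <= 0:
        raise ValueError("absorbance must be >= 0 and dilution factor > 0")
    return a260 * 40.0 * dilution_factor


def total_yield_ug(concentration_ng_per_ul, volume_ul):
    """Total RNA in micrograms for a stock of the given volume."""
    return concentration_ng_per_ul * volume_ul / 1000.0


# Hypothetical reading: A260 = 0.5 on a 1:100 dilution of a 200 ul stock.
stock = rna_concentration_ng_per_ul(0.5, dilution_factor=100)
print(stock, total_yield_ug(stock, 200))
```

With these example numbers the stock works out to 2000 ng/µl, or 400 µg in 200 µl, in line with the expected yield quoted in step 17.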




4. C-Terminal Epitope Tagging in C. albicans 

This protocol relies on homologous recombination to integrate the 
coding sequence for a C-terminal epitope tag in place of the stop codon for 
any gene at its endogenous locus. The pADH34 vector contains the coding
sequence for a 13× myc repeat, while pADH52 encodes a 6-His/FLAG
tandem affinity purification (TAP) tag. As both of these constructs use the
same linker sequence, either tag can be amplified with a single pair of PCR
primers. Briefly, long oligonucleotides (typically 90–120 bp total) are used
to amplify a 4.8-kb DNA fragment which, when integrated into the
genome, will replace the stop codon of the target gene with the epitope
tag coding sequence, followed by the SAT1/flipper cassette. Upon
confirmation of integration at the desired locus in nourseothricin-resistant
colonies, the SAT1/flipper cassette is excised, leaving only the epitope tag
coding sequence and a minimally disruptive FLP recombinase target
sequence behind. The SAT1/flipper cassette and marker excision procedure
were developed by Reuss et al. (2004).

4.1. Primer design 

Synthesize a "forward knock-in primer" encompassing the sense strand
sequence of the target gene up to, but not including, the stop codon.
In place of the stop codon, add the sequence
"CGGATCCCCGGGTTAATTAACGG" to the 3' end of the forward knock-in primer. To
generate the "reverse knock-in primer," take the reverse complement of
the sequence immediately downstream of the stop codon and add the
sequence "GGCGGCCGCTCTAGAACTAGTGGATC" to the 3' end.
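
The primer design rule above can be sketched in a few lines. This is an illustrative helper under stated assumptions: the function names, the `stop_index` argument, and the default homology length of 70 nt (chosen so that the total primer length falls within the 90–120 range given earlier) are not from the original protocol; only the two 3' tail sequences are.

```python
# 3' tails taken from the primer design text; everything else is hypothetical.
FORWARD_TAIL = "CGGATCCCCGGGTTAATTAACGG"
REVERSE_TAIL = "GGCGGCCGCTCTAGAACTAGTGGATC"
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_knock_in_primers(locus, stop_index, homology=70):
    """Return (forward, reverse) knock-in primers.

    locus      -- sense-strand sequence spanning the 3' end of the ORF
    stop_index -- 0-based index of the first base of the stop codon
    homology   -- assumed number of gene-specific bases in each primer
    """
    locus = locus.upper()
    if locus[stop_index:stop_index + 3] not in STOP_CODONS:
        raise ValueError("stop_index does not point at a stop codon")
    upstream = locus[max(0, stop_index - homology):stop_index]
    downstream = locus[stop_index + 3:stop_index + 3 + homology]
    if len(upstream) < homology or len(downstream) < homology:
        raise ValueError("not enough flanking sequence for the homology arms")
    forward = upstream + FORWARD_TAIL
    reverse = reverse_complement(downstream) + REVERSE_TAIL
    return forward, reverse
```

For example, `design_knock_in_primers(locus, locus.rfind("TAA"))` would return both primers for a locus whose stop codon is the last TAA in the supplied sequence; in practice the stop position should be taken from the gene annotation rather than searched for.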

4.2. PCR conditions 

Perform 30–35 cycles of amplification with pADH34 or pADH52 and the
knock-in primers using Ex Taq (Takara) or a similar increased-fidelity,
high-activity thermostable polymerase. Use a three-step program for the first
five cycles, with annealing at 58 °C and 5 min extensions at 72 °C. For the
remaining cycles perform a two-step program, eliminating the annealing step.
To minimize the chances of acquiring PCR generated mutations in the 
knock-in cassette, perform three independent PCR reactions for each target 
gene and pool the reactions following amplification. 

4.3. Transformation 

Directly transform ~20–50 µl of the knock-in cassette PCR product
(without purification) into the target strain using standard C. albicans
transformation methods as described above. Following transformation, but prior
to selection, wash the cells twice with YEPD, then split into two independent
5 ml cultures (to ensure isolation of independent clones) and recover for at
least 5 h at 30 °C. Pellet and plate the entire culture onto YEPD + 400 µg/ml
nourseothricin. Note: addition of adenine and/or uridine to the growth
medium, even with prototrophic strains, can increase the efficiency of several
steps in this protocol.

4.4. Integration confirmation

Screen nourseothricin resistant colonies by colony PCR with the following 
primers: 

Upstream flank check: Use a primer that hybridizes ~500 bp upstream of the
target gene stop codon (extending toward the stop codon) and AHO300
(CCGTTAATTAACCCGGGGATC). AHO300 anneals to the linker
sequence, which is common to both pADH34 and pADH52, and
extends into the tagged ORF.

Downstream flank check: Use a primer that hybridizes ~500 bp downstream 
of the target gene stop codon (extending toward the stop codon) and 
AHO301 (GGAACTTCAGATCCACTAGTTCTAGAGC), which 
anneals to both pADH34 and pADH52. 



4.5. SAT1 marker excision 

To induce excision of the SAT1/flipper cassette, culture
nourseothricin-resistant strains in YEP-maltose (2%) for at least 5 h (or
overnight) at 30 °C and plate ~100 cells/plate on YEPD + 25 µg/ml
nourseothricin. (Note that some mutant strains are hypersensitive to
nourseothricin, and lower concentrations (<5 µg/ml) may be necessary.)
Following 1–2 days of growth at 30 °C, small, medium, and large colonies
should be observed. Patch small and medium-sized colonies onto
YEPD + 400 µg/ml nourseothricin and onto YEPD without selection to screen
for nourseothricin-sensitive colonies. To confirm excision of the SAT1/FLP
cassette, perform colony PCR with either AHO302 (TCACTAGTGAATTCGCGCTCGAG,
for myc tagging with pADH34) or AHO405 (TAAATAATGAATTCGCGCTCGAG,
for TAP tagging with pADH52) and the
downstream flank check primer described above.

4.6. Tag sequence confirmation 

To confirm that the target ORF and the epitope tag are free of mutations,
perform colony PCR using a high-fidelity polymerase and the following
primers: AHO283 (GGCGGCCGCTCTAGAACTAGTGGATC, common
to both pADH34 and pADH52) and the upstream flank check primer
designed above. AHO283 anneals to the 3' end of the residual SAT1/FLP
cassette sequence (including the FRT) and extends toward the tagged gene.
Following colony PCR amplification from the tagged strain, purify the
PCR product and sequence with AHO283 as the sequencing primer.

Note: The pADH34 myc tagging construct inserts 629 bp in place of the
original stop codon, while the pADH52 TAP tagging construct inserts
152 bp. To determine the expected size of the PCR product for
sequencing, add this number to the distance between the upstream flank
check primer and the original stop codon of the target gene.
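
As a quick worked example of the size calculation in the note above (a sketch only; the function name and the 500 bp primer distance are illustrative assumptions, while the insert sizes are those stated in the note):

```python
# Insert sizes (bp) stated in the note above, keyed by tagging plasmid.
TAG_INSERT_BP = {"pADH34": 629, "pADH52": 152}


def expected_product_size(primer_to_stop_bp, plasmid):
    """Expected colony PCR product size (bp) for tag sequence confirmation.

    primer_to_stop_bp -- distance between the upstream flank check primer
                         and the original stop codon of the target gene
    plasmid           -- "pADH34" (13x myc) or "pADH52" (TAP)
    """
    return primer_to_stop_bp + TAG_INSERT_BP[plasmid]


# With an upstream flank check primer placed ~500 bp from the stop codon:
print(expected_product_size(500, "pADH34"))  # myc-tagged allele
print(expected_product_size(500, "pADH52"))  # TAP-tagged allele
```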



4.7. Schematic of the 13× myc tagging procedure

Note: The following figures outline the myc tagging protocol, which uses 
pADH34. Refer to the text above for a description of the minor variations 
in this process that are specific to TAP tagging with pADH52.




5. C. albicans Chromatin Immunoprecipitation

Chromatin immunoprecipitation (ChIP) procedures with C. albicans 
are comparable overall to those used with S. cerevisiae and mammalian 
cells, and the following protocol is based on standard ChIP methods (for
example, see Lee et al., 2006). We have found, however, that the methods
used for cell lysis and DNA shearing are critical for performing
high-resolution genome-wide ChIP (ChIP-chip) experiments with C. albicans.
The following protocol has been used successfully, with reproducible
results, to perform high-resolution ChIP-chip experiments with planktonic
cultures of C. albicans, Kluyveromyces lactis, and S. cerevisiae (Tuch et al.,
2008). This protocol has also been used, with increased lysis times, to
perform ChIP with C. albicans biofilms (Nobile et al., 2009) and with
both yeast and mycelial forms of Histoplasma capsulatum (Nguyen and Sil,
2008; Webster and Sil, 2008). We also describe a rapid method for
amplification of ChIP DNA samples and hybridization to high-density
oligonucleotide tiling arrays.

In the previous section, we described a method for C-terminal epitope 
tagging in C. albicans that can be used to rapidly tag genes of interest for 
ChIP. Although the use of affinity-purified polyclonal antibodies raised 
against a unique peptide within a protein of interest is arguably a less 
disruptive method of immunoprecipitation, there are several drawbacks to 
such an approach. These "peptide antibodies" are costly, take time to 
produce, and often require extensive optimization for ChIP experiments. 
To control for cross-reactivity, which is often a problem with peptide 
antibodies, a viable gene deletion strain is required as a negative control, 
making ChIP results with essential genes much more difficult to validate. 
Lastly, at least two different peptides from each protein of interest should be 
used to raise antibodies, as it is not unusual to have one or both sets of 
antibodies fail completely in ChIP experiments. C-terminal epitope tagging 
and immunoprecipitation with commercially available, high-specificity 
monoclonal antibodies offers a rapid, economical, and effective method to 
circumvent many of these problems. 



5.1. Chromatin immunoprecipitation protocol 

Step 1: Culture growth and cross-linking 

1. Grow 200-400 ml of cells to OD = 0.4 

Note: ~200 ml of OD = 0.4 is sufficient for one batch of lysate,
which is sufficient material for as many as 10 individual ChIPs.

2. Add fresh formaldehyde (if previously opened, use within 1 month) to a
final concentration of 1% (stock is 37%) and cross-link 15 min at room
temperature with occasional mixing.

3. Quench cross-linking by adding 2.5 M glycine (make fresh) to a final 
concentration of 125 mM and incubate for 5 min at RT. 

4. Collect cells by centrifugation for 10 min at 1000 × g in a fixed-angle
centrifuge rotor. 

5. Decant and resuspend pellets in 10 ml ice-cold TBS and transfer to 15 ml
Falcon tubes. Pellet, decant, and repeat the wash one more time, then
resuspend the pellet in 2 ml ice-cold TBS. Split the cell suspension into two
2 ml Sarstedt tubes, pellet, decant, and freeze pellets in liquid nitrogen.
Store at −80 °C or proceed to Step 2 (skip freezing).
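
The stock volumes needed for cross-linking and quenching follow from a simple dilution balance. The sketch below is an illustration rather than part of the original protocol: the helper name is hypothetical, and it assumes the added stock volume counts toward the final volume (many labs simply round these values).

```python
def volume_to_add(current_volume, stock_conc, final_conc):
    """Volume of stock to add so the mixture reaches final_conc.

    Solves stock_conc * v = final_conc * (current_volume + v) for v.
    Concentrations may be in any unit, as long as both use the same one;
    the result is in the unit of current_volume.
    """
    if not 0 < final_conc < stock_conc:
        raise ValueError("final concentration must be below the stock concentration")
    return current_volume * final_conc / (stock_conc - final_conc)


culture_ml = 200.0
# 37% formaldehyde stock to 1% final
formaldehyde_ml = volume_to_add(culture_ml, 37.0, 1.0)
# 2.5 M glycine stock to 125 mM final, added after the formaldehyde
glycine_ml = volume_to_add(culture_ml + formaldehyde_ml, 2.5, 0.125)
print(round(formaldehyde_ml, 2), round(glycine_ml, 2))
```

For a 200 ml culture this gives roughly 5.6 ml of formaldehyde stock and 10.8 ml of glycine stock.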

Step 2: Cell lysis and immunoprecipitation 

1. Thaw cell pellets on ice, weigh the pellet (tare scale w/empty tube) and
resuspend pellets in 700 µl ice-cold lysis buffer with protease inhibitors.
(Add protease inhibitors immediately prior to use at the following final
concentrations: 1 mM PMSF, 1 mM benzamidine, 1 µg/ml each
leupeptin, pepstatin, and bestatin; alternatively, Roche complete
protease inhibitor cocktail (EDTA-free, catalog #11836170001) can
be used; mix 1 Roche tablet with 10 ml lysis buffer.)

2. Transfer cell suspension to a fresh 1.75-ml microfuge tube filled to the
500 µl mark with 0.5 mm glass beads.

3. Place in an Eppendorf mixer (part #5432), clamped vortex genie, or 
equivalent for ~2 h at 4 °C. 

Note: Mixing times may vary, depending on cell type and growth 
conditions. For example, this technique has been used successfully with 
biofilms by extending the mixing time to > 4 h on a vortex mixer. 

4. Check cell lysis under the microscope and if >90% of cells are lysed, 
proceed to step 5. 

Note: Cells should appear as a mixture of "ghosts" and fragmented 
cell debris by phase contrast microscopy. 

5. Recover the lysate: Invert the tubes containing lysate/beads and wipe
with 70% ethanol. Allow to dry, then pierce the bottom of the tube
with a 26-gauge needle. Open the tube and place it (right side up) into
a 5-ml Falcon tube and pierce the Falcon tube (above the level of the
bottom of the microfuge tube) with an 18-gauge needle attached to a
vacuum line. The lysate should flow through to the bottom of the
Falcon tube (alternate: recover by centrifugation into a larger tube).
Recover the lysate and transfer 300 µl to each of two fresh 1.75 ml
microfuge tubes (for Bioruptor shearing) or transfer entire lysate to one
fresh 1.75 ml tube (for microtip sonication).

6. Shear chromatin by sonication in a Diagenode Bioruptor (15 min,
high setting, 30 s on, 1 min off) or with a microtip sonicator (5 × 20 s at
level 2, 100% duty cycle, with 1 min on ice between each pulse).

Note: Shearing with a Bioruptor yields smaller fragment sizes, tighter
shear distribution, and greater consistency than the tip sonication
method and is highly recommended for ChIP-chip applications.

7. Pellet cell debris for 5 min at 14,000 rpm at 4 °C and transfer the 
supernatant (lysate) to a fresh tube. 

8. Remove 50 µl of the lysate and transfer to 200 µl TE/1% SDS. This is
the "input DNA" sample, which can be stored at −20 °C until the end
of Step 3, when it is processed along with the immunoprecipitated
DNA.

9. Aliquot and dilute sheared lysate according to the number of IPs to be
performed. For each IP, use 50–500 µl of crude lysate in 500 µl (final
volume) lysis buffer (with fresh protease inhibitors). The relative
amounts of lysate in each IP can be equalized between strains or
samples by normalizing against the mass of each cell pellet.

10. Add antibody (typically 5 µg of affinity-purified polyclonal antibody or
2 µg of monoclonal anti-myc antibody) and incubate overnight at 4 °C
on a nutator.

11. The next day, add 50 µl of a 50% slurry of protein-A or protein-G
Sepharose beads (washed two times with TBS and three times with lysis
buffer) and incubate at least 2 h at 4 °C on a nutator.
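
One way to carry out the pellet-mass normalization mentioned in step 9 is sketched below. This is an assumption-laden illustration, not the authors' stated procedure: it assumes every pellet was lysed in the same buffer volume (so lysate concentration scales with pellet mass), uses the lightest pellet as the reference, and the function and sample names are hypothetical.

```python
def normalized_lysate_volumes(pellet_masses_mg, reference_volume_ul=500.0,
                              min_ul=50.0, max_ul=500.0):
    """Lysate volume per IP so that each IP receives equivalent cell mass.

    The lightest pellet gets reference_volume_ul; heavier pellets get
    proportionally less. Volumes outside the 50-500 ul range given in
    step 9 are rejected rather than silently clipped.
    """
    lightest = min(pellet_masses_mg.values())
    volumes = {}
    for sample, mass in pellet_masses_mg.items():
        volume = reference_volume_ul * lightest / mass
        if not min_ul <= volume <= max_ul:
            raise ValueError(f"{sample}: {volume:.0f} ul is outside {min_ul}-{max_ul} ul")
        volumes[sample] = volume
    return volumes


# Hypothetical pellet masses in mg for a tagged strain and its untagged control.
print(normalized_lysate_volumes({"tagged": 100.0, "untagged": 125.0}))
```

Each volume is then brought to 500 µl with lysis buffer containing fresh protease inhibitors, as described in step 9.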

Step 3: Recovery of immunoprecipitated DNA 

1. Wash beads as follows:

Pellet 1 min at 1000 × g and draw off the supernatant with an 18-gauge
needle on a vacuum line. Wash with the buffers indicated below for
5 min each with mixing on a nutator:

2× with 1 ml lysis buffer
2× with 1 ml lysis buffer w/500 mM final NaCl
2× with 1 ml Wash buffer
1× with 1 ml TE

Note: Although Wash buffer temperatures, incubation temperatures, 
and incubation times can all be optimized for each antibody, we have 
found that ice-cold buffers and 5 min incubations at room temperature 
work best for most antibodies. 

2. After the last wash, draw off TE and add 110 µl elution buffer, vortex,
and incubate 10 min at 65 °C, mixing every 2 min.

3. Pellet 30 s at 14,000 rpm at room temperature and remove 100 µl to a
fresh tube.

4. Add 150 µl TE + 0.65% SDS and vortex vigorously. Pellet and remove
150 µl and pool with previous eluate (250 µl final).

5. Incubate IP and "input DNA" samples (from Step 2) for ~16 h at 65 °C.

Step 4: Cross-link reversal and DNA cleanup

1. Add 250 µl of proteinase K mix (for each sample: 238 µl TE, 2 µl 5 mg/ml
glycogen, 10 µl 10 mg/ml proteinase K) and incubate 2 h at 37 °C.

Note: Make a fresh proteinase K solution each time from lyophilized
powder.

2. Add 55 µl 4 M LiCl and 500 µl phenol:chloroform:isoamyl alcohol
(25:24:1), pH 8.0. Vortex briefly, then spin 1 min at >10,000 × g and
remove 500 µl of the aqueous layer to a fresh tube.

Note: AMRESCO Biotechnology Grade phenol:chloroform:isoamyl
alcohol (code 0883-100 ml) has provided reliable performance in this
protocol.

3. Add 1 ml ice-cold 100% ethanol and incubate at −20 °C overnight or at
least 1 h at −80 °C.

4. Centrifuge 30 min at >10,000 × g at 4 °C and decant carefully with a 1-ml
pipette.

5. Wash pellet 1× with 70% EtOH, spin 5–10 min, decant, spin briefly,
and remove residual EtOH.

6. Air dry the pellets and resuspend. Use 25 µl TE for IP samples and 100 µl
TE + 100 µg/ml RNaseA for input DNA samples.

7. Incubate input DNA/RNaseA solution for 1 h at 37 °C, then store
at −20 °C.

Note A: An optional DNA cleanup step could be performed on the
input DNA following this step (e.g., with a commercial DNA cleanup kit).
However, this adds an additional variable (relative to the IP'd DNA) and
may actually contribute to "spiky" data in ChIP-chip experiments. It is
probably safest to leave the RNaseA in the input DNA sample and avoid
any cleanup steps prior to amplification.

Note B: Although chromatin shearing with the Bioruptor is highly
reproducible (assuming cell lysis is >90%), it is advisable to monitor the
shear distribution of the input DNA sample prior to proceeding with
subsequent analysis of ChIP samples. Test the shear distribution by
running ~200–500 ng of purified input DNA (purify an aliquot with
a DNA purification mini column) on a 2% agarose gel at ~5 V/cm. The
average shear size from the Bioruptor is typically ~200 bp, with most
fragments distributed between 100 and 400 bp.



5.2. Chromatin immunoprecipitation buffers 

Be sure to use autoclaved ddH2O and baked glassware when making buffers
to avoid DNA contamination. This caution is especially important for the
final Wash buffers and post-elution steps.

TBS: 20 mM Tris/HCl (pH 7.5), 150 mM NaCl




Lysis buffer: 50 mM HEPES/KOH (pH 7.5), 140 mM NaCl, 1 mM EDTA,
1% Triton X-100, 0.1% Na-deoxycholate
Lysis buffer w/ 500 mM NaCl: same as above, increase total NaCl
concentration to 500 mM
Wash buffer: 10 mM Tris/HCl (pH 8.0), 250 mM LiCl, 0.5% NP-40, 0.5%
Na-deoxycholate, 1 mM EDTA
Elution buffer: 50 mM Tris/HCl (pH 8.0), 10 mM EDTA, 1% SDS
TE/0.67% SDS: 10 mM Tris/HCl (pH 8.0), 1 mM EDTA, 0.67% SDS
TE/1% SDS: 10 mM Tris/HCl (pH 8.0), 1 mM EDTA, 1% SDS

4 M LiCl 

2.5 M glycine (fresh) 

10 mg/ml proteinase K in TE (fresh) 

5 mg/ml glycogen (in TE) 

5.3. Strand displacement amplification of ChIP samples 

The following protocol uses high concentration exo⁻ Klenow (New England
Biolabs #M0212M) and random DNA nonamers to perform strand displacement
amplification of the input and IP DNA samples from ChIP
experiments. Prior to amplification, input and IP DNA concentrations are
normalized by dilution of the input DNA for each corresponding IP based on
the qPCR values for a nonenriched locus, such as the ADE2 ORF (primers
AHO294: GTTGTCAGATCATTAGAAGGGGAAG and AHO295:
AAGTATCTGGGATCCTGGCA). Input and IP samples are amplified
separately, in parallel, and should yield similar amounts of product following
each round of amplification. Typically, three rounds of amplification are
required prior to dye coupling and hybridization of the ChIP samples. If
the IP DNA concentration is sufficient, Round B amplification can be
omitted. Since this is a nonspecific amplification, all DNA will be amplified
by this approach; all amplification steps should be performed with clean
gloves, filter tips, autoclaved ddH2O, and dedicated reagents which are free
of any contaminating DNA.
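
The input dilution can be estimated from the qPCR values as sketched below. This is a hedged illustration: it assumes the qPCR values are threshold cycles (Ct) for the nonenriched locus and that amplification efficiency is close to a doubling per cycle; neither the function names nor these assumptions come from the original protocol.

```python
def input_dilution_factor(ct_input, ct_ip, efficiency=2.0):
    """Fold dilution of input DNA needed to match the IP sample.

    ct_input, ct_ip -- Ct values for a nonenriched locus (e.g., ADE2)
    efficiency      -- assumed fold amplification per cycle (2.0 = perfect)
    """
    factor = efficiency ** (ct_ip - ct_input)
    if factor < 1.0:
        raise ValueError("IP appears more concentrated than input; check the Ct values")
    return factor


def dilution_recipe(factor, final_volume_ul):
    """Return (input DNA in ul, TE in ul) for a dilution of the given factor."""
    input_ul = final_volume_ul / factor
    return input_ul, final_volume_ul - input_ul


# Hypothetical Ct values: input 20.0, IP 27.0 at the nonenriched locus.
factor = input_dilution_factor(20.0, 27.0)
print(factor, dilution_recipe(factor, 128.0))
```

In this example the IP signal appears seven cycles later than the input, so the input would be diluted 128-fold in TE before Round A.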

• Round A (primary amplification):

1. Mix:

- 12 µl of IP sample or diluted input (diluted in TE)

Note: equalize the input and IP samples based on qPCR values
for a nonenriched locus.

- 12 µl H2O

- 20 µl 2.5× SDA buffer

2. Incubate 95 °C, 5 min, then immediately transfer to an ice water bath
for 5 min.

3. Add 5 µl dNTP mix (1.25 mM each nucleotide).

4. Add 1 µl 50 U/µl exo⁻ Klenow (NEB).



Genetics and Molecular Biology in Candida albicans 755 

5. Incubate 2 h at 37 °C with heated lid in a thermal cycler.

Note: If needed, let the reactions sit up to ~2 h at 10 °C following
amplification or add 5 µl 0.5 M EDTA and store at −20 °C.

6. Purify product with Zymo columns (Zymo Research):

Add at least 3 volumes of binding buffer, bind, wash one time with
200 µl binding buffer, two times with 200 µl Wash buffer, spin 1 min
at 10,000 × g to dry, and elute with 30 µl H2O into a fresh tube.

7. Check 1.5 µl on a NanoDrop spectrophotometer (Thermo Scientific). If
>400 ng total, skip to Round C. Otherwise continue with Round B.

• Round B (secondary amplification):

1. Mix:

- 24 µl Round A DNA

- 20 µl 2.5× SDA buffer

2. Repeat steps 2–7 of Round A, but elute with 50 µl H2O.

• Round C (aminoallyl-dUTP incorporation and final amplification):

Preferred approach:

Perform 100 µl reactions with 1–2 µg total Round B DNA for each
sample. This yields only ~2.5- to 3-fold amplification, but dye coupling
is still very efficient.

1. Mix:

- 1–2 µg of Round B DNA + H2O to 48 µl total

- 40 µl 2.5× SDA buffer

2. Incubate 95 °C, 5 min, then immediately transfer to an ice water bath
for 5 min.

3. Add 10 µl 1.25 mM aminoallyl-dNTP mix (1:10 dilution of stock
solution).

4. Add 2 µl 50 U/µl exo⁻ Klenow.

5. Incubate 2 h at 37 °C with heated lid in a thermal cycler.

Note: If needed, let the reactions sit up to ~2 h at 10 °C following
incubation or add 5 µl 0.5 M EDTA and store at −20 °C.

6. Purify Round C product with Zymo columns: Add at least 3
volumes of binding buffer, bind, wash one time with 200 µl binding
buffer, two times with 200 µl Wash buffer, spin 1 min at 10,000 × g to
dry, and elute with 50 µl H2O into a fresh tube.

7. Check 1.5 µl on NanoDrop; the yield should be ~5 µg of total DNA
per reaction.

Alternate Round C approach:

If Round B yields less than 1 µg total DNA, set up 2 × 100 µl Round C
reactions for each sample, using 200–400 ng of Round B DNA per
tube. Perform amplification and cleanup as described for the preferred
approach, but pool the two independent reactions prior to the Zymo
column purification.
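
The yield checkpoints in Rounds A to C can be summarized as a small decision helper. This is only a sketch of the thresholds stated above (400 ng after Round A, 1 µg after Round B); the function names and return strings are illustrative and not part of the protocol.

```python
def total_dna_ng(concentration_ng_per_ul, volume_ul):
    """Total DNA (ng) from a NanoDrop concentration and the eluate volume."""
    return concentration_ng_per_ul * volume_ul


def next_round_after_a(round_a_total_ng):
    """Round A checkpoint: more than 400 ng total allows skipping Round B."""
    return "Round C" if round_a_total_ng > 400.0 else "Round B"


def round_c_setup(round_b_total_ng):
    """Round C checkpoint: choose the preferred or the alternate approach."""
    if round_b_total_ng >= 1000.0:
        return "preferred: one 100 ul reaction with 1-2 ug DNA"
    return "alternate: two 100 ul reactions with 200-400 ng DNA each, pooled before cleanup"


# Hypothetical readings: 10 ng/ul in the 30 ul Round A eluate,
# then 30 ng/ul in the 50 ul Round B eluate.
print(next_round_after_a(total_dna_ng(10.0, 30.0)))
print(round_c_setup(total_dna_ng(30.0, 50.0)))
```

With these example readings Round A yields 300 ng, so Round B is performed; Round B then yields 1.5 µg, which is enough for the preferred Round C setup.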

5.4. Strand displacement amplification solutions

2.5× SDA mix: (best if made fresh, but can be kept at −20 °C for up to
1 month)

- 125 mM Tris-HCl, pH 7.0

- 12.5 mM MgCl2

- 25 mM BME

- 750 µg/ml random DNA nonamers (dN9)

10× aminoallyl-dNTP stock solution:

- 12.5 mM dATP

- 12.5 mM dCTP

- 12.5 mM dGTP

- 5 mM dTTP

- 7.5 mM aa-dUTP



5.5. Dye coupling 

1. Speed-vac amplified input and IP reactions from Round C to <9 µl
each, or until dry.

2. Resuspend or QS to 9 µl with H2O and add 1 µl of fresh 1 M Na
bicarbonate, pH 9.0.

Note: Prepare Na bicarbonate on the day of labeling and carefully adjust
the pH using a pH meter.

3. Immediately add 1.25 µl Cy3 (input sample) or Cy5 (IP sample).

Note: We use Amersham monoreactive dye packs (Cat. #PA23001 and
PA25001). Each tube contains sufficient dye for eight labeling reactions.
Resuspend the dye in 10 µl DMSO and use 1.25 µl of dye per labeling
reaction. If fewer than eight reactions will be performed, either decrease the
volume of DMSO to use the entire tube or aliquot and speed-vac the unused
dye. Store any unused dye under desiccation at 4 °C, protected from light.

4. Incubate labeling reactions for 1 h at room temperature in darkness.

5. Purify dye-coupled DNA with Zymo columns (Zymo Research):

Add 800 µl of Zymo DNA binding buffer to each sample and load
onto a Zymo column. Wash one time with 200 µl DNA binding
buffer, two times with 200 µl Wash solution, spin 1 min at 10,000 × g
to dry, then elute with 50 µl H2O. Check 1.5 µl on a NanoDrop
spectrophotometer using the "microarray" setting to quantitate the
total yield and dye-coupling efficiency (typically >20 picomoles of dye
per microgram of DNA).

6. Proceed to array hybridization, following the array manufacturer's
guidelines. Equalize the input and IP samples to 5 µg each for a
1 × 244K format Agilent microarray.

Note: We have found Agilent custom oligonucleotide arrays,
hybridization buffers, and Wash buffers to consistently yield high-quality data.
The following hybridization protocol was adapted from the Agilent
oligo aCGH/ChIP-on-Chip hybridization kit.
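
The dye-coupling efficiency checked in step 5 can also be computed by hand from the absorbance readings. The sketch below rests on assumptions that are not stated in the protocol: absorbances normalized to a 1-cm path length, commonly quoted extinction coefficients of 150,000 M⁻¹cm⁻¹ for Cy3 (at ~550 nm) and 250,000 M⁻¹cm⁻¹ for Cy5 (at ~650 nm), 50 ng/µl of double-stranded DNA per A260 unit, and no correction for dye absorbance at 260 nm. The function names are hypothetical.

```python
# Assumed extinction coefficients (M^-1 cm^-1) at the dye absorbance maxima.
EXTINCTION = {"Cy3": 150_000.0, "Cy5": 250_000.0}


def dye_pmol_per_ul(a_dye, dye):
    """Dye concentration in pmol/ul from a 1-cm path-length absorbance."""
    # A / epsilon gives mol/l; 1 mol/l equals 1e6 pmol/ul.
    return a_dye / EXTINCTION[dye] * 1e6


def dna_ug_per_ul(a260, ng_per_ul_per_a260=50.0):
    """DNA concentration in ug/ul (assumes the dsDNA factor of 50 ng/ul per A260)."""
    return a260 * ng_per_ul_per_a260 / 1000.0


def coupling_efficiency(a260, a_dye, dye):
    """Picomoles of dye per microgram of DNA."""
    return dye_pmol_per_ul(a_dye, dye) / dna_ug_per_ul(a260)


# Hypothetical Cy3-labeled input sample: A260 = 2.0, A550 = 0.45.
print(round(coupling_efficiency(2.0, 0.45, "Cy3"), 1))
```

These example readings correspond to about 30 pmol of dye per µg of DNA, above the typical >20 pmol/µg quoted in step 5.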

5.6. ChIP-chip hybridization protocol (adapted from the
Agilent oligo aCGH/ChIP-on-Chip hybridization kit)

This protocol is for competitive hybridization of amplified, dye-coupled
ChIP and input DNA using Agilent 1 × 244K format oligonucleotide tiling
arrays. While a newer version of this protocol can be found on the Agilent
website, we include this protocol and notes for convenience. Please follow
manufacturer's guidelines for other array formats.

1. Mix 5 µg each (input and IP) sample and bring volume to 150 µl with H2O.

Note: Less DNA can be used; as little as 1 µg each of input and IP
samples has been successfully hybridized and scanned with no significant
decrease in data quality.

2. Add 50 µl of 1 mg/ml Human Cot-1 DNA (Invitrogen).

3. Add 50 µl of 10× Agilent blocking agent.

4. Add 250 µl of Agilent hybridization buffer.

5. Mix thoroughly, then quick-spin to collect.

6. Incubate 3 min at 95 °C, then transfer immediately to 37 °C for 30 min.

7. Spin 1 min at full speed in microcentrifuge, then carefully remove 490 µl,
load onto gasket slide, cover with array slide, and assemble hybridization
chamber.

8. Hybridize for ~40 h at 65 °C in an Agilent microarray hybridization
oven with the rotation speed set at "20."

9. Disassemble the array and wash using Agilent Wash buffers.

a. Incubate array 5 min with mixing in Agilent oligo aCGH/ChIP-on-Chip
Wash Buffer 1 at 25 °C.

b. Incubate 5 min with mixing in Agilent oligo aCGH/ChIP-on-Chip
Wash Buffer 2 at 32 °C.

c. Incubate 1 min with mixing in acetonitrile at 25 °C.

d. Incubate 30 s with agitation in Agilent drying and stabilization solution.

Note: To ensure even drying, very slowly remove the slide holder
such that ~10 s elapse prior to complete removal from the solution.

Note: For disassembly, hold the microarray/gasket slide "sandwich"
submerged in Wash Buffer 1 while gently gripping the sides of the
microarray slide. Carefully pry the gasket slide off of the array by inserting the
tip of a plastic forceps between the outer edge of the two slides and
lightly twisting the forceps. The gasket slide will fall away, while the
array should remain in your hands. Be sure to avoid any contact with
the printed array surface.




Abstract

In this chapter we present basic protocols for the use of Schizosaccharomyces 
pombe, commonly known as fission yeast, in molecular biology and genetics 
research. Fission yeast is an increasingly popular model organism for the study 
of biological pathways because of its genetic tractability and as a model for 
metazoan biology. It provides an alternative and complementary approach to
Saccharomyces cerevisiae for addressing questions of cell biology, physiology, 
genetics, and genomics/proteomics. We include details and considerations for 
growing fission yeast, information on crosses and genetics, gene targeting and 
transformation, cell synchrony and analysis, and molecular biology protocols. 




1. Introduction 

Schizosaccharomyces pombe, or fission yeast, is an archaeascomycete that
is evolutionarily remote from budding yeast (approximately 10^9 years
separation) (Heckman et al., 2001). While the genomes are similar in size
(13.8 Mb for S. pombe, 12.1 Mb for Saccharomyces cerevisiae), they are neither 
related nor syntenic. Fission yeast is much less likely than budding yeast to 
have duplicated genes (Hughes and Friedman, 2003), although it is more 
likely to have introns (Wood et al., 2002). The fission yeast genome is
divided between just three chromosomes, with large and complex replica- 
tion origins and centromeres that are models for metazoan structures 
(Forsburg, 1999). Additionally, it shares some genes with humans that are 
missing from budding yeast (Aravind et al., 2000), which makes it a
complementary experimental system to budding yeast.

S. pombe offers the traditional yeast strengths of excellent genetics and an 
easily manipulated genome. The tools for fission yeast are conceptually 
similar to those for budding yeast, although their biological details are 
specific to the S. pombe system. The fission yeast community is smaller 
than the budding yeast community, and has traditional strengths in cell 
growth and division, differentiation, DNA replication and repair, and 
chromosome dynamics. More recently, investigators have used S. pombe 
to study a variety of other cell biology problems including signal transduc- 
tion, RNA splicing, and cell morphology. An international fission yeast 
conference is held every 2–3 years, alternating between Europe, Japan, and
North America. There is also a list-serv ("pombelist") and a number of
Web-based resources (Wood and Bähler, 2002) (refer to Appendix).

In this chapter we provide essential methods for S. pombe growth and 
manipulation suitable for a beginning lab. Other excellent reviews on 
S. pombe biology and methods may be found in Egel (2007), Forsburg 
(2003a), and Forsburg and Rhind (2006). 



Methods for Fission Yeast 761 




2. Biology, Growth, and Maintenance 
of Fission Yeast 

Fission yeasts are rod-shaped cells that divide by medial fission.
Wild-type cells are typically 8–14 µm in length and 4 µm wide, and grow
primarily in length, not in width. Diploids are proportionately larger in all
dimensions. The cell cycle is divided into distinct G1 (10%), S (10%), G2
(70%), and M (10%) phases. In exponentially growing wild-type cells, the
nuclear division cycle is staggered relative to cell division, and newly divided
nuclei enter the next cell cycle and undergo G1 and S phase prior to
cytokinesis. Thus, upon completion of cell division, the newborn cell is
already in G2 phase. For this reason, a single cell particle almost always has a
2C DNA content, either because it is in the extended G2 phase or because it
is a binucleate G1 or S phase cell. Because of the strict control of cell size,
overall length is an effective metric for the position in the cell cycle.

The essentials of S. pombe culture are identical to those used for
S. cerevisiae; they differ in growth rate and media selection. Cultures are
usually initiated on rich medium and replica plated to test markers and
ploidy. Media formulations are detailed in Tables 32.1 and 32.2.
Yeast-extract medium, YE or YES (+ supplements), is rich but poorly defined,
and is used for initial growth and/or recovery. Synthetic and well-defined
media such as Edinburgh minimal medium 2 (EMM2) or its derivative,
pombe minimal glutamate (PMG), are used in experiments requiring stable
physiological conditions or maintenance of auxotrophic markers.
Appropriate supplements are added as needed, and single marker dropouts are
used to determine auxotrophy. Cells will grow on S. cerevisiae media, but
S. pombe media are optimized for this species and are recommended. Mating
media are described in Table 32.2.



2.1. Other media supplements: Phloxin B

Phloxin B (PB; Magdala Red, Sigma P4030) is a vital stain used in agar 
plates to identify compromised cells. Healthy cells are efficient at eliminat- 
ing PB, making pale pink colonies. Sick cells retain more PB, leading to a 
dark pink or red color. This is useful for rapid screening of replica plates, 
for example, screening for drug sensitivity, isolating temperature sensitive 
mutations, or identifying auxotrophs on medium lacking supplements. 
Additionally, diploid cells are less efficient at removing PB and thus
diploid colonies are darker pink than haploid colonies. Assessment is based upon
visual inspection of colonies, as the color differences are not apparent in 
single cells. It is useful to have a known, healthy haploid streaked alongside 
test strains for comparison. PB is mildly toxic and is only used for short-term 



Table 32.1 Growth media for fission yeast, Schizosaccharomyces pombe

Rich media: Poorly defined because of variations in yeast extract and
preparation, this medium supports vigorous growth with added glucose. YE is a
naturally low-adenine medium. Additional supplements are added to maximize
growth rate.

Name:    Yeast extract (YE)
Use:     Vegetative growth, thiamine repression
Recipe (concentration):
         5 g/l yeast extract (0.5% w/v)
         30 g/l glucose (3.0% w/v)
Notes:   • Inhibits conjugation and sporulation

Name:    Yeast extract + supplements (YES)
Use:     Vegetative growth; maintaining diploids
Recipe:  YE base plus:
         225 mg/l each of adenine, L-histidine, L-leucine, uracil, and
         L-lysine
Notes:   • Medium of choice for most nonphysiological growth; inhibits
           conjugation and sporulation

Name:    YES + Phloxin B
Use:     • Screening diploids
         • Temperature or drug sensitivity detection
Recipe:  YES plus:
         5 mg/l phloxin B (Sigma, P4030)
Notes:   • Make 2000× PB stock as 10 g/l in water and filter sterilize. Store
           at room temperature in the dark
         • Store plates in dark, room temperature, up to 1 month. Discard
           upon color loss or browning

Synthetic media: All these media are derived from the buffered defined media
called EMM. Variations in carbon source, nitrogen source, and additional
supplements may be required for different uses. EMM-glutamate, called PMG,
is used as a standard medium in some labs.

Name:    Edinburgh minimal medium 2 (EMM)
Use:     Vegetative growth
Recipe (concentration):
         3 g/l potassium hydrogen phthalate (14.7 mM)
         2.2 g/l Na2HPO4 (15.5 mM)
         5 g/l NH4Cl (93.5 mM)
         20 g/l glucose (111 mM)
         20 ml/l salts (50× stock)
         1 ml/l vitamins (1000× stock)
         0.1 ml/l minerals (10,000× stock)
Notes:   • Defined growth medium
         • Add supplements (ade, arg, his, leu, lys, ura) from stock
           solutions, described below

Name:    50× salt stock
Recipe (concentration):
         52.2 g/l MgCl2·6H2O (0.26 M)
         0.735 g/l CaCl2·2H2O (4.99 mM)
         50 g/l KCl (0.67 M)
         2 g/l Na2SO4 (14.1 mM)
Notes:   • Made in water and autoclaved. Store stock at 4 °C indefinitely

Name:    1000× vitamin stock
Recipe (concentration):
         1 g/l pantothenic acid (4.2 mM)
         10 g/l nicotinic acid (81.2 mM)
         10 g/l inositol (55.5 mM)
         10 mg/l biotin (40.8 µM)
Notes:   • Made in water and autoclaved. Store stock at 4 °C indefinitely

Name:    10,000× mineral stock
Recipe (concentration):
         5 g/l boric acid (80.9 mM)
         4 g/l MnSO4 (23.7 mM)
         4 g/l ZnSO4·7H2O (13.9 mM)
         2 g/l FeCl2·6H2O (7.40 mM)
         1 g/l KI (6.02 mM)
         0.4 g/l molybdic acid (2.47 mM)
         0.4 g/l CuSO4·5H2O (1.60 mM)
         10 g/l citric acid (47.6 mM)
Notes:   • Made in water and autoclaved. Store stock at 4 °C indefinitely

Name:    Supplements/stock solutions (EMM + supplements)
Use:     For auxotrophic markers
Recipe:  7.5 g/l adenine, histidine, leucine, lysine, arginine
         3.75 g/l uracil
Concentration: 50–225 mg/l
Notes:   • Made in water and autoclaved. Add to EMM or PMG as required
         • Heat uracil gently to solubilize

Name:    Minimal low adenine
Use:     Screening ade6− alleles
Recipe:  EMM + required supplements + 7.5 mg/l adenine
Notes:   • Adenine concentrations from 7.5 to 30 mg/l may be used

Name:    Minimal − N
Use:     G1 arrest at 25 °C
Recipe:  As for EMM, but omit NH4Cl
Notes:   • Starvation medium used for G1 arrest

Name:    Minimal − N − G
Use:     G2 arrest at 25 °C; storing spores
Recipe:  As for EMM, but omit NH4Cl and glucose
Notes:   • Starvation medium, used to arrest diploids in G2

Name:    Minimal + thiamine
Use:     nmt1 promoter repression
Recipe:  As for EMM, add: 5 µg/ml thiamine before use
Concentration: 15 µM thiamine
Notes:   • 2000× thiamine stock solution is 10 mg/ml in water. Filter
           sterilize and store in the dark, room temperature
         • Thiamine can be titrated from 0.01 to 20 µM to regulate
           expression (Javerzat et al., 1996)

Name:    Pombe minimal glutamate (PMG)
Use:     Vegetative growth, and screening heterochromatic markers
Recipe (concentration):
         As for EMM, but omit NH4Cl and add: 3.75 g/l L-glutamic acid,
         monosodium salt (Sigma G 5889) (22.1 mM)
Notes:   • More even growth between Ura+/− strains
         • Compatible with G418 selection
         • Add supplements as appropriate

Notes to Table 32.1:

Media are prepared in bulk batches, and autoclaved at 121 °C, 30.5 psi for
20 min to avoid overcaramelizing sugars. Some fission yeast media are
commercially available, and require only mixing powder and water.

Solid media are made by adding 2% Difco Bacto Agar.

When transforming a construct using the nmt1 promoter, we recommend inclusion
of thiamine in growth and maintenance steps to prevent toxicity from gene
overexpression. This has an added benefit of promoting growth.

Drug plates are typically made with YES to eliminate limiting nutrient
effects. When made in EMM, the drug concentration is different and often
higher than the effective dose on YES. The drug G418 is most often used in
YES, and never in EMM. However, G418 resistance can be studied selectively
using PMG.

Rich medium (YE/YES) contains thiamine, as does SD medium from budding yeast,
and cannot be used for high-level protein expression in S. pombe under the
nmt1 promoter.

Phloxin B and thiamine are always added to cool (60 °C) molten agar.

Supplements are typically added once agar has cooled to 60 °C. Dropout media,
in which all possible supplements are added except one, are not widely used in
fission yeast, simply because of habit.

Anecdotal evidence suggests that lower supplement concentrations are required
in PMG medium, typically 75 mg/l. This is likely due to more efficient uptake
of supplements in the glutamate medium than in EMM.

Low adenine medium allows the development of red color in Ade− strains, and
can be used to help discriminate ade6 alleles visually. PMG may require a
lower amount of adenine for color discrimination (<7.5 mg/l).

PMG may cause unwanted sporulation in some diploid strains. YES is the first
choice for growing diploid cultures, followed by EMM for physiological
experiments. PMG is preferred when screening heterochromatic silencing markers
or G418 selection. Some labs use PMG exclusively.



Table 32.2 Media for conjugation and sporulation

Name:    Malt extract (ME)
Recipe (concentration):
         30 g/l bacto-malt extract (3% w/v)
         225 mg/l each: adenine, histidine, leucine, and uracil
         Adjust to pH 5.5 with NaOH
Notes:   • Complex nitrogen-limiting medium; not well defined
         • Contains thiamine
         • Some batch-to-batch variation

Name:    Sporulation agar with supplements (SPAS)
Recipe (concentration):
         10 g/l glucose (1% w/v)
         1 g/l KH2PO4 (7.3 mM)
         1 ml/l 1000× vitamin stock
         45 mg/l each: adenine, histidine, leucine, uracil, and lysine
         hydrochloride (1/5 normal concentrations)
Notes:   • Stringent synthetic mating medium
         • Benefit: consistency batch-to-batch
         • Drawback: may produce inefficient mating in sickly strains

Solid media is made by adding 2% agar. Autoclave media for 15–20 min, 121 °C, 30 psi.



testing and not long-term storage. Cells on PB plates will die rapidly if
refrigerated, and colonies should be used or transferred to YES agar within a 
few days of plating on YES + PB. PB plates are light sensitive and should be 
stored in the dark. 



2.2. Other media supplements: Adenine and low-Ade media

The ade6 gene, the ortholog of ADE2 in budding yeast, provides a
colorimetric system for monitoring plasmid and chromosome maintenance. Under
conditions of low adenine (Table 32.1), cells induce the adenine biosynthetic
pathway and ade6 mutants accumulate a red/pink intermediate. Growth on
full levels of adenine represses the pathway and colonies remain white.

2.3. Storage of fission yeast 

Frozen glycerol stocks are made by mixing 1 volume of a late-logarithmic
YES culture with 1 volume of sterile 50% glycerol (v/v in YES or water) in
a cryovial, and freezing at −80 °C. Once frozen, strains can be maintained
at −80 °C indefinitely, although there is some loss of viability over long
periods (10 years or more).

2.4. Growth in liquid media 

A wild-type culture in liquid YES at the optimal temperature of 32 °C has a
doubling time of 2–3 h (3–4 h at 25 °C); in EMM/PMG this is slightly
longer, usually by 0.5–1 h at 32 °C (+1 h or more at 25 °C). A starter
culture is made in a small volume of liquid YES and grown overnight at an
appropriate temperature. The starter culture is diluted into selective
medium to obtain a physiologically reproducible population for study,
using Eq. (32.1):

$$\mathrm{vol}_{sc} = \frac{\mathrm{vol}_{c} \times \mathrm{OD}_{des}}{\mathrm{OD}_{sc} \times 2^{(n-1)}} \quad \text{and} \quad n = \frac{\text{time}}{t_{D}} \qquad (32.1)$$

where $\mathrm{vol}_{sc}$ is the volume of starter used to inoculate,
$\mathrm{vol}_{c}$ is the volume of the overnight culture,
$\mathrm{OD}_{des}$ is the desired OD600 of the culture at a given
time, $\mathrm{OD}_{sc}$ is the measured OD600 for the starter culture at the
time of inoculation, and $t_{D}$ is the doubling time for a given strain. If
the starter culture is late-log or stationary, $2^{(n-1)}$ is used, but if the
culture is in mid-exponential growth this may be changed to $2^{n}$, as the
first-generation "lag time" will not be as pronounced.
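As an illustration only, Eq. (32.1) can be turned into a small helper. This is a minimal sketch: the function name, the argument names, and the example numbers are assumptions made for the example, not values from the protocol.

```python
def starter_volume(vol_c, od_des, od_sc, time_h, t_d, lag_generation=True):
    """Return vol_sc, the starter volume to inoculate, following Eq. (32.1).

    vol_c   -- volume of the culture being inoculated (result has the same unit)
    od_des  -- desired OD600 at the end of the incubation
    od_sc   -- measured OD600 of the starter culture at inoculation
    time_h  -- incubation time in hours
    t_d     -- doubling time of the strain in hours
    lag_generation -- True uses 2**(n - 1) (late-log or stationary starter);
                      False uses 2**n (mid-exponential starter)
    """
    if min(vol_c, od_des, od_sc, t_d) <= 0 or time_h < 0:
        raise ValueError("volumes, ODs and doubling time must be positive")
    n = time_h / t_d                      # number of generations in the incubation
    exponent = n - 1 if lag_generation else n
    return vol_c * od_des / (od_sc * 2 ** exponent)


if __name__ == "__main__":
    # Hypothetical numbers: 100 ml culture, OD600 0.5 wanted after 16 h,
    # starter at OD600 4.0, doubling time 4 h.
    print(f"stationary starter:      {starter_volume(100, 0.5, 4.0, 16, 4):.2f} ml")
    print(f"mid-exponential starter: "
          f"{starter_volume(100, 0.5, 4.0, 16, 4, lag_generation=False):.2f} ml")
```

The `lag_generation` switch mirrors the choice between the two exponents described above; which one applies depends on the state of the starter culture.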

Growth rates can be measured by charting OD600 over time in an
exponentially growing culture. Growth curves are particularly important
for conditional mutants such as temperature-sensitive strains, in which the
cells are shifted to the restrictive temperature to determine how quickly
they stop growing. For most temperature-sensitive mutants, temperature
arrest will occur within 1–2 generations (4–6 h). However, OD600 measures
cell mass, which may not be the same as cell number in cell cycle mutants
that continue growth without dividing. For these phenotypes, accurate cell
numbers are determined using a hemocytometer or a Coulter Counter.
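One way to extract a doubling time from such a growth curve is a straight-line fit of ln(OD600) against time. The following is a sketch under the assumption that all readings come from the exponential phase; the function name and the data are hypothetical.

```python
import math


def doubling_time_h(times_h, od600):
    """Estimate doubling time (h) by least-squares fit of ln(OD600) vs. time.

    Assumes every reading was taken during exponential growth.
    """
    if len(times_h) != len(od600) or len(times_h) < 2:
        raise ValueError("need at least two paired (time, OD600) readings")
    log_od = [math.log(v) for v in od600]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(log_od) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, log_od))
    slope = sxy / sxx                     # growth rate in ln-units per hour
    if slope <= 0:
        raise ValueError("culture is not growing over this interval")
    return math.log(2) / slope


if __name__ == "__main__":
    # Synthetic readings generated with a 2.5 h doubling time.
    times = [0, 1, 2, 3, 4]
    ods = [0.1 * 2 ** (t / 2.5) for t in times]
    print(f"estimated doubling time: {doubling_time_h(times, ods):.2f} h")
```

As noted above, OD600 tracks cell mass, so for cell cycle mutants this estimate describes mass doubling rather than cell number doubling.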




3. Genetics and Physiology 



The S. pombe life cycle is predominantly haploid (Fig. 32.1), and in
contrast to budding yeast, fission yeast cells only reluctantly make diploids
when starved for nitrogen. In normal conditions these diploids are transient,
and the zygotes immediately proceed through meiosis and sporulation.
Diploids can be recovered in the laboratory by selection for complementing
markers and maintained through vegetative growth, but they are unstable
and prone to sporulate (see Section 3.6). Azygotic asci produced from



[Figure 32.1 diagram. Labels: haploid, vegetative growth (G1-phase, S-phase,
G2-phase, M-phase); G0-phase; conjugation; meiosis; zygotic ascus; diploid,
vegetative growth; azygotic ascus.]

Figure 32.1 Life cycle of Schizosaccharomyces pombe. The fission yeast life cycle is
predominantly haploid, and divided into distinct phases: G1 (10%), S (10%), G2 (70%),
and M (10%). The nuclear division cycle is offset relative to cell division such that newly
divided nuclei undergo G1 and S-phase prior to cytokinesis. Consequently, septation
index is a metric for the proportion of S-phase cells in a population. Cell length is
regulated, and increases throughout G2, and is an indicator of cell cycle position. S.
pombe cells can be induced to enter a quiescent (G0) state by nitrogen starvation.
Starvation also induces expression of mating-type factors that permit conjugation in the
presence of a partner of the opposite mating type. Meiosis is quickly initiated in the
transient diploid, producing zigzag- or banana-shaped zygotic asci. However, diploid
cells may be trapped through complementing markers and induced to enter a diploid
vegetative cycle by maintenance on rich medium and at higher temperatures. Upon
starvation, diploid cells will quickly complete meiosis and sporulate as linear azygotic asci.

diploids are small and linear. Zygotic asci from coupled mating and sporulation
are banana- or zigzag-shaped.

Fission yeast are induced to mate by nitrogen starvation, which
synchronizes cells in G1 and induces expression of P (Plus, h+) or
M (Minus, h−) information from the mat1 locus (Beach et al., 1982; Willer
et al., 1995). Homothallic wild-type strains (h90) are capable of switching
mating type between h+ and h− every second generation with information
from silent mating-type loci. The mechanism of switching is conceptually
similar to, but molecularly distinct from, that in budding yeast (Klar, 1992).
Approximately half of the cells in an h90 population will be h+ and half will
be h− at a given time. For convenience, common lab strains are generally
mating-type stable, or heterothallic, and require a partner of the opposite
mating type for conjugation. While most h− lab strains have lost
P-information, h+ lab strains retain both M- and P-information in a rearranged
configuration and are capable of switching at a low frequency in the population
from h+ to h90 (<10^-3 per generation) (Klar, 1992; Klar et al., 1991).

These differences in mating and sporulation make fission yeast genetics
different in practice from budding yeast genetics. Because mating and meiosis
are coupled, it is generally not necessary to isolate diploids first; simply
cross two strains and let them proceed to sporulation. However, this also
means that complementation tests are less convenient than linkage analysis.
Some effort is required to maintain diploids, which are always primed to
sporulate. Additionally, haploid-specific genes are not shut off in diploids,
meaning that diploids express both mating types and can mate with each other to
form tetraploids at low but discernible frequencies.

Mating and sporulation are also intrinsically temperature sensitive.
Efficient mating requires that cells arrest in the G1 phase of the cell cycle
under nitrogen starvation; however, efficiency of G1 arrest declines with
increasing temperatures, with a higher fraction of cells arresting in G2.
Additionally, sporulation is inhibited by high temperature, so maintaining a
diploid at temperatures >30 °C helps reduce unwanted sporulation. Thus, mating
and sporulation should optimally be performed at 25 °C and no higher than
29 °C.



3.1. Performing genetic crosses 

1. Take a single colony of one strain and smear into a small patch onto an
ME- or SPAS-agar plate. With a fresh sterile toothpick, take a single
colony of the second strain of opposite mating type, and smear alongside
the first.

2. Place 10 µl of sterile water onto the yeast and mix with a fresh sterile
toothpick. Allow the patch to dry briefly before inverting and incubating
at 25 °C for 2–3 days.

3. Confirm mating microscopically; strains that have mated successfully
will produce banana-shaped or zigzag zygotes and asci, and will stain
positively with iodine (Section 3.2).



3.2. Testing mating type and sporulation by iodine staining 

Successful conjugation and sporulation may be assessed by exposing a 
mating mixture on plates to iodine vapor, which stains the starch in the 
ascus walls but not the vegetative cells. However, iodine kills both vegeta- 
tive cells and spores, so it must be performed on a duplicate plate, and any 
further analysis performed on the strains from the untreated plate. 

1. Patch strains to be tested onto YES in a line. Include h+ and h− controls.
On a separate YES plate, patch a long line each of h+ and h− control
(Fig. 32.2). Grow plates for 2–3 days.




[Figure 32.2 diagram. Labels: known mating-type strains; strains to test
(control h+, h− at bottom); replica plate onto ME/SPAS; spores, I2 positive;
no mating, I2 negative.]

Figure 32.2 Mating-type testing by replica plating. To screen large numbers
of unknown strains, a plate of known mating-type strains (one h+, one h−) is made
and grown on YES (A). At the same time, a plate with unknowns plus a control strain
of each mating type is made with lines of culture across the plate, and grown on YES
or suitable medium (B). Once the plates have grown, they are replica plated onto ME or
SPAS agar, and incubated for 2–3 days at 25 °C. The lines from each plate are
perpendicular to each other to create intersecting areas with both known mating-type
streaks (C). When exposed to iodine vapor, intersections with sporulated yeast will
turn black (shown for the control strains); in the case of an h90 strain, the parent will also
mate and stain black (line labeled h90). Since mating may be poor or incomplete,
patches should be examined microscopically for the presence of zygotes and asci (left
box). Unmated cells will be round and starved (right). If replica plated onto two ME
plates, one plate is used to iodine stain cells (causing death), while the second is used to
recover spores.

2. Replica plate the control h+ and h− plate onto an ME-agar plate. Change
filter. Replica plate the strains for testing onto the same ME-agar plate
perpendicularly to the control lines (Fig. 32.2). Incubate plates at 25 °C
for 2–3 days.

3. Invert the agar plate over a dish of iodine crystals (Sigma 13380) for
5–10 min. Sporulated patches will stain black, fading to purple with
time. Unsporulated yeast will be orange/cream colored. Strains with
switching or mating defects may show a gray or speckled phenotype.
The iodine crystals can be "capped" with an empty plate, and stored in a
fume hood for repeated use, adding fresh crystals as needed. Iodine is
toxic and volatile, and should only be used in a fume hood.



3.3. Random spore analysis 

Random spore analysis (RSA) is the simplest method to analyze a large 
number of sporulation products. Asci from the cross are incubated in a 
dilute solution of glusulase (snail gut enzyme) to digest the ascus wall and 
liberate spores, while killing any vegetative cells that did not mate. This is 
highly efficient in fission yeast and makes RSA a practical option for many 
strain constructions, reducing the need for more time-consuming tetrad 
analysis. Its success depends on the ability to unambiguously identify double 
mutants, and on screening sufficient numbers for robust statistics to deter- 
mine appropriate segregation ratios. When in doubt, pull tetrads. 
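To make the point about segregation ratios concrete, a chi-square goodness-of-fit statistic can be computed from the replica-plating counts. The sketch below assumes two unlinked markers, for which the four spore classes are expected at 1:1:1:1; the class counts are invented for illustration, and the function name is hypothetical. The critical values are the standard 5% chi-square cutoffs.

```python
# Standard 5% critical values of the chi-square distribution, keyed by
# degrees of freedom (number of classes minus one).
CRITICAL_5_PERCENT = {1: 3.841, 2: 5.991, 3: 7.815}


def chi_square(observed, expected_ratio):
    """Chi-square goodness-of-fit statistic for observed class counts."""
    if len(observed) != len(expected_ratio) or not observed:
        raise ValueError("observed counts and expected ratio must match")
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    statistic = 0.0
    for count, ratio in zip(observed, expected_ratio):
        expected = total * ratio / ratio_sum
        statistic += (count - expected) ** 2 / expected
    return statistic


if __name__ == "__main__":
    # Hypothetical counts for the four classes of a two-marker cross:
    # wild type, single mutant A, single mutant B, double mutant.
    counts = [52, 47, 55, 46]
    stat = chi_square(counts, [1, 1, 1, 1])
    cutoff = CRITICAL_5_PERCENT[len(counts) - 1]
    verdict = "consistent with" if stat < cutoff else "deviates from"
    print(f"chi-square = {stat:.2f}; data {verdict} 1:1:1:1 at the 5% level")
```

A significant deviation may reflect linkage, reduced viability of one class, or ambiguous scoring, which is when tetrad analysis becomes the safer choice.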

1. Prepare 1 ml of 0.5% glusulase solution (PerkinElmer #NEE154001) in
water for each cross. With a sterile toothpick, scoop up some of the
mating patch from the mating plate, and resuspend in the enzyme
solution. Vortex well and incubate overnight at 25 °C.

2. Count spore concentration in each cross using a hemocytometer. Spores
are small, round, and highly refractive, while vegetative cells will have a
typical S. pombe rod shape and appear darker or fuzzy from the glusulase
treatment. Plate no more than 500 spores onto a YES plate. Remaining
spores can be washed and stored in water or EMM − N − G for up to a month
(survival is better in EMM − N − G).

3. Grow spores at 25–36 °C for 4–6 days, then replica plate to test markers
and determine candidates for analysis.
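The plating volume in step 2 follows from the hemocytometer count. The sketch below assumes a standard improved Neubauer chamber, in which each large square holds 0.1 µl so that the mean count per large square multiplied by 10^4 gives particles per ml; that chamber geometry, the function names, and the counts are assumptions, not part of the protocol.

```python
def spores_per_ml(counts_per_large_square, dilution_factor=1.0):
    """Spore concentration from hemocytometer counts.

    Assumes each counted square has a volume of 0.1 microlitre
    (1 mm x 1 mm x 0.1 mm), as in a standard improved Neubauer chamber.
    """
    if not counts_per_large_square:
        raise ValueError("at least one square must be counted")
    mean_count = sum(counts_per_large_square) / len(counts_per_large_square)
    return mean_count * 1e4 * dilution_factor


def plating_volume_ul(concentration_per_ml, target_spores=500):
    """Volume (microlitres) of suspension containing the target spore number."""
    if concentration_per_ml <= 0:
        raise ValueError("concentration must be positive")
    return target_spores / concentration_per_ml * 1000.0


if __name__ == "__main__":
    conc = spores_per_ml([48, 52, 50, 50])      # hypothetical counts
    print(f"{conc:.2e} spores/ml")
    print(f"plate {plating_volume_ul(conc):.1f} microlitres for 500 spores")
```

When the computed volume is too small to pipette accurately, diluting the suspension first and passing the dilution through `dilution_factor` keeps the arithmetic consistent.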



3.4. Tetrad dissection 

Tetrad analysis is performed similarly to S. cerevisiae with only a few
S. pombe-specific considerations. Because the S. pombe ascus breaks down
very easily, no pretreatment with glusulase is required; however, tetrads
must be dissected within a day or two of mating. Cells from a mating
mixture are spread onto a YES plate and inspected. Ripe asci with refractive
spores are identified and moved to a unique position on the plate using a
micromanipulator. Plates are incubated until asci degrade, releasing the
spores. This may be carried out at room temperature overnight, or as high
as 36 °C (depending on the strain and time requirements) to accelerate ascus
breakdown. Upon inspection, the four spores will have popped out of the
ascus and can be easily dissected into appropriate positions on the plate.
Following colony formation, the tetrad plate may be replica plated to score
segregation of different markers.



3.5. Bulk spore germination 

For lethal disruptions that have no conditional alleles, a convenient way to 
assess phenotype is by a mass spore germination experiment where spores 
from the disruption diploid are inoculated in liquid culture under condi- 
tions where only the disrupted spores with the selectable marker can 
germinate. Their phenotype can then be monitored in a timecourse using 
morphological analysis, flow cytometry, protein characterization, or other 
methods. 

1. Inoculate a sporulation-competent diploid colony in 10 ml YES and
grow to OD600 of 0.8–0.9, shaking at 32 °C (25 °C if temperature
sensitive).

2. Inoculate 100 ml of liquid ME with 10 ml of the culture from step 1, and
shake at 25 °C for 3 days. Check microscopically for sporulation.

3. Harvest cells in a sterile tube. Centrifuge 500 × g, 5 min and decant.
Resuspend the cells in 10 ml of 2% glusulase (v/v in water), and shake at
25 °C, 150 rpm overnight.

4. Check microscopically that asci and vegetative cells have been digested.
Harvest cells in a sterile tube, and centrifuge as above. Wash pelleted
cells five times with 15 ml of EMM − N − G, and resuspend in 5 ml
EMM − N − G.

5. Prepare 40 ml of sterile 25% glycerol (v/v in water) in 50 ml conical
tubes. Overlay the spores. Centrifuge 10 min, 2000 × g, room temperature
and then carefully remove supernatant. (This step removes cell
debris from digestion.)

6. Wash pellet three times with 15 ml EMM − N − G. Resuspend pellet
in 10 ml EMM − N − G, and examine. Repeat step 5 to further clean
spores if necessary. Store at 4 °C.

7. Plate a small amount onto YES to test markers. Inoculate spores into
minimal selective media and sample hourly time points for germination
and DNA replication. At 32 °C, S phase occurs between approximately
6 and 8 h.

3.6. Isolating diploids

Some situations require isolating and propagating vegetative diploids. For
example, outcrossing a homothallic h90 strain (which would otherwise mate
with itself), performing synchronous meiosis experiments, or any other
experiment requiring the presence of homologous chromosomes depends
upon diploid isolation. Diploids are isolated by selecting for complementing
markers, and are maintained during growth on YES. It is important to
remember that diploids are de-repressed for sporulation, forming azygotic
asci upon starvation, which may occur even within a single colony. The
banana-shaped zygotic asci from mating cells may be distinguished
morphologically from the linear azygotic asci produced by sporulating diploids.
Diploids maintained for a prolonged time are prone to accumulate mutations
that block meiosis and sporulation. For this reason, diploids are
generally made freshly when needed and not stored for extended periods.

The intragenic complementing markers ade6-M210 and ade6-M216 are
an ideal tool for diploid selection. These mutations confer an Ade+ phenotype
if present in the same (diploid) cell, while individually in haploids they
are Ade−. Using these as the complementing markers of choice avoids
potential complications from recombinant haploid offspring during strain
isolation. If these are not available, any pair of complementing markers may
be used, but care must be taken to ensure that the selected strain is a diploid
and not a recombinant haploid offspring.

Importantly, because diploids are de-repressed for sexual differentiation, 
any conditions that lead to nitrogen limitation can lead to unwanted 
sporulation. For example, colonies on minimal media have likely sporulated 
in the middle. Thus final screening of isolated diploids on YES medium is 
essential. High temperatures (>30 °C) also suppress the sporulation 
response of diploids. 

1. Mate strains with complementing markers on ME/SPAS agar (as in
Section 3.1).

2. Remove a patch of the mating mixture and streak on selective media at
intervals following mating (typically 16 and 26 h to start). Incubate at
32 °C for 4–5 days (unless temperature sensitive) until small colonies
form. These colonies will have sporulated in the middle, so actual
diploid cells must be further purified.

3. Streak several colonies onto YES + PB, including a haploid control for
comparison. Diploid cells are longer and wider than haploids, and stain
dark pink on PB. Identify appropriate colonies and streak on YES. They
can be stored for no longer than 3–4 weeks at 4 °C. It is recommended
to make diploids fresh when needed if possible. Prolonged storage even
at −80 °C leads to loss of sporulation capacity. Before use, verify
sporulation ability by streaking to single colonies on YES and replica
plating to ME for iodine screening. Diploids will sporulate fully within
12 h of plating on ME/SPAS media. Ploidy may be confirmed by
sporulation competence or by FACS.

4. On rare occasions, a vegetative haploid will skip mitosis and form a stable
homozygous diploid, a process called endoreduplication. These strains
may be identified by phloxin B staining, examination of cell size, and
flow cytometry to confirm DNA content. Such diploids are unable to
sporulate as they are mating-type homozygous and will behave like a
haploid. However, expression of the missing mating-type information
from an ectopic plasmid can induce sporulation, allowing some recovery
of haploid offspring (Willer et al., 1995). Haploidization can also be
induced by treating with drugs that promote chromosome loss such as
mFPA (Kohli et al., 1977); because aneuploidy is not tolerated in fission
yeast, loss of a single chromosome in a diploid is accompanied by rapid
loss of the remaining chromosomes to reduce the complement to
haploid.



3.7. Cell synchrony in fission yeast cultures 

In an asynchronous, exponentially growing wild-type fission yeast culture,
approximately 70% of cells will be in G2 phase at any given time, with the
remaining 30% equally divided between G1, S, and M phases. Cytokinesis
occurs after the peak of S phase. The historically favored method to isolate a
synchronous population is elutriation, which requires an elutriation rotor
and specially modified centrifuge (e.g., Beckman J26XPI, product
#393134) capable of differential sedimentation and regulating media flow
during separation and collection (Walker, 1999). Fortunately, fission yeast
cultures can also be synchronized by growth arrest, cell cycle mutations, or
size selection, which do not require specialized equipment. Using media
lacking nitrogen or glucose, or adding hydroxyurea, takes advantage of
fission yeast metabolism to synchronize populations in G1, G2, or S phases,
respectively. Cell cycle mutant alleles are arrested by temperature shift and
give tight arrests. Finally, an alternative size-based separation method using
a lactose gradient can separate a G2 population without special equipment.
To confirm synchrony and monitor progression, samples can be monitored
in real time for septation index, or fixed in ethanol for later analysis by
morphology or flow cytometry. Septation index is obtained by counting the
number of cells with a septum that have not yet invaginated. Septa may be
viewed under phase microscopy or by staining with calcofluor (see
Section 5.4). Since septation occurs concurrently with the next S phase,
septation index is a measure of the next cell cycle, as well as an indicator of
degree of synchrony, ideally 60–80% in the population for an effective
block and release experiment.

3.7.1. S-phase arrest by hydroxyurea block and release

Hydroxyurea (HU) inhibits ribonucleotide reductase, causing nucleotide
depletion and S-phase stalling. HU treatment has the potential to disturb
cells with checkpoint or replication recovery defects, or alter S-phase
events. We find that filtration is the most efficient and most gentle way to
wash cells, but low-speed centrifugation may also be used to remove HU
(500 × g, 5 min).

1. Grow a culture in minimal medium to mid-log phase (OD600 0.3–0.6,
approximately 1–2 × 10^7 cells/ml).

2. Add HU to 12 mM, and shake for 4 h at 25–32 °C. HU
(FW = 76.05 g/mol) stock solution is made up freshly at 1 M in water
and filter sterilized.

Note: 10–15 mM HU will arrest cells; 20–25 mM HU will cause an
irreversible arrest and significant toxicity to wild-type cells. HU will also
cause cell elongation over prolonged incubation.

3. Set up a vacuum filtration unit, with a fresh, sterile glass fiber filter
(Whatman #18220555). Turn on vacuum line, and pour culture into
filter bell. Once the liquid has passed through leaving cells on the filter
paper, wash twice with 0.5 culture volume of fresh EMM.

4. Remove filter, and scrape cells into 1 volume of fresh EMM + supplements.
Keep as sterile as possible. This will yield approximately 50%
synchrony for the next S phase.
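The quantities in step 2 can be worked out with two short helpers. This is a sketch only: the function names and the example volumes are hypothetical, and the stock-volume calculation accounts for the volume added to the culture, which for a 1 M stock changes the answer by about 1%.

```python
HU_FORMULA_WEIGHT = 76.05  # g/mol, as given in step 2


def hu_mass_g(stock_molar, stock_volume_ml):
    """Grams of HU needed to make the given volume of stock solution."""
    return stock_molar * (stock_volume_ml / 1000.0) * HU_FORMULA_WEIGHT


def stock_volume_ml(final_mm, culture_ml, stock_mm=1000.0):
    """Millilitres of stock to add so the culture reaches final_mm (mM).

    Solves stock_mm * v = final_mm * (culture_ml + v) for v.
    """
    if not 0 < final_mm < stock_mm:
        raise ValueError("final concentration must be between 0 and the stock")
    return final_mm * culture_ml / (stock_mm - final_mm)


if __name__ == "__main__":
    # Hypothetical example: 10 ml of fresh 1 M stock, 100 ml culture, 12 mM final.
    print(f"weigh {hu_mass_g(1.0, 10):.4f} g HU for 10 ml of 1 M stock")
    print(f"add {stock_volume_ml(12, 100):.2f} ml of stock to 100 ml culture")
```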



3.7.2. Lactose gradient centrifugation 

Centrifuging cultures through a lactose gradient separates the cells based on
volume/size. This allows enrichment for the smallest fission yeast cells,
which are "newborn" G2 cells immediately after cytokinesis, and avoids
cell cycle disruption (Carr et al., 1995). Because it relies on cell size, only
strains with uniform morphology can be synchronized in this way; strains
with irregular shapes or elongation phenotypes are not good candidates for
this method.

1. Grow a 100-ml culture in minimal media to mid-log phase, OD600 of
0.5–1.0. Harvest 100 ODs of cells and resuspend in 3 ml media.

2. Prepare 45 ml of 10–40% lactose gradient (in media) using a gradient
maker.

Alternatively, 45 ml of 20% lactose can be frozen at −80 °C for 4 h, and
then thawed without disturbing for 3 h at 30 °C, spontaneously generating a
10–30% gradient. The 40% solution must be warmed to solubilize the
lactose, and both 40% and 10% solutions are made in YES or EMM,
autoclaved, and stored at room temperature.

3. Layer the cells on top of the lactose gradient.

4. Centrifuge at 500 × g, 5 min in a swinging bucket rotor. Remove 2 ml
fractions from the top of the gradient, and examine for uniformly small
cells (usually in fractions 1–5).

5. Pool fractions and centrifuge 500 × g, 5 min to pellet cells. Wash cells
once with fresh medium and then resuspend in 0.2 volumes of medium.

This method isolates newly divided G2 cells, representing about 5–10%
of the total culture amount (fewer than 1% should be septated). If desired,
a second (10 ml) gradient may be used.



3.7.3. Block and release using temperature sensitive alleles 

The key to temperature shift is a rapid increase and decrease of temperature. 
Standard blocks for S. pombe are cdc10 (G1/START), cdc25 (G2, immediately 
before mitosis), and nuc2 (M), imposed by temperature shift to 36 °C for 4 h, 
followed by rapid cooling in an ice-water bath to 25 °C for release. 
Incubation at higher temperatures, or for longer than 4 h, will cause cell 
death. The cdc10-V50 allele causes cells to accumulate in G1 and enter 
S phase ~30 min after release, completing replication approximately 90 min 
post release as assessed by flow cytometry. The cdc25-22 allele is tightly 
temperature sensitive and delivers a G2-arrested population that releases 
quickly and synchronously (~90%) into the next S phase. Cells grow very 
long during the cdc25-22 arrest because cell volume increases in G2, and 
flow cytometry for DNA content using whole cells may be variable due to the 
variability in cell size and shape. Therefore, septation index is critical to 
assess kinetics in a cdc25-22 block and release. 

1. Grow cultures in EMM with supplements to early/mid-log phase 
(OD600 of 0.3–0.5, approximately 1 × 10^7 cells/ml). Sample the 
asynchronous culture for comparison. 

2. Temperature shift the cultures to 36 °C in a prewarmed water bath and 
shake at 225 rpm, 36 °C for 4 h. At the end of the incubation, sample the 
synchronized culture (t = 0). 

3. Shift back to 25 °C by swirling flasks in ice/water for approximately 
5 min. A thermometer cleaned with ethanol is useful to monitor the 
culture temperature as it falls to 25 °C. 

4. Put flasks into a prewarmed 25 °C water shaker (225 rpm) and assess 
synchrony by sampling for septation index/flow cytometry every 20 min. 
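
Septation index is the percentage of counted cells that show a septum, so the samples taken every 20 min can be tabulated with a few lines of Python. The sketch below is illustrative only: the function names are hypothetical, and the counts in the usage example are invented placeholders, not data from this chapter.

```python
from typing import Dict, Tuple


def septation_index(septated: int, total: int) -> float:
    """Percentage of counted cells that show a septum."""
    if total <= 0:
        raise ValueError("count at least one cell")
    if not 0 <= septated <= total:
        raise ValueError("septated must be between 0 and total")
    return 100.0 * septated / total


def peak_timepoint(counts: Dict[int, Tuple[int, int]]) -> Tuple[int, float]:
    """Return (minutes after release, septation index) at the peak.

    counts maps minutes after release to (septated cells, total cells).
    """
    indices = {t: septation_index(s, n) for t, (s, n) in counts.items()}
    t_peak = max(indices, key=indices.get)
    return t_peak, indices[t_peak]


if __name__ == "__main__":
    # Placeholder counts for illustration only.
    example = {0: (2, 200), 20: (10, 200), 40: (150, 200), 60: (40, 200)}
    for minutes, (septated, total) in sorted(example.items()):
        print(minutes, f"{septation_index(septated, total):.1f}%")
    print("peak:", peak_timepoint(example))
```

Counting a fixed number of cells per time point (for example, 200) keeps the indices comparable across the time course.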



3.7.4. G1/G0 arrest of fission yeast by nitrogen starvation 

Nitrogen starvation at low temperature is used to enrich a population for 
G1 cells, particularly for release into synchronous meiosis using the pat1-114 
allele. However, cells require approximately 2 h of lag time to reenter the 
cell cycle. This reentry is not uniform, so cells show only modest 
synchrony compared to other methods. 

1. Grow a culture in EMM to OD600 0.8–1.0 (approximately 2–3 × 10^7 cells/ml). 
Using a flask 5× larger than the culture volume will enhance aeration and 
starvation at all points of the experiment. 

2. Harvest cells in a sterile conical tube by centrifuging at 500 × g for 
5 min at room temperature. Decant supernatant and vortex pellet. 

3. Wash cells twice in 0.5 culture volume of EMM−N medium, vortexing 
well. 

4. Resuspend cells in 1 volume of EMM−N. If cells are Ade− auxotrophs, 
add adenine stock solution to 7.5 mg/l (1/1000 dilution). Other 
supplements are not required. 

5. Shake cells in a 25 °C air shaker for 16–18 h. Longer than 20 h is not 
recommended, as cells will arrest in G0 and not release. Temperatures above 
25 °C will lead to G2, rather than G1, arrest. 

6. To release cells, add 1 culture volume of EMM (+N) and supplements at 
150 mg/l. The final concentration of NH4Cl in solution will be 2.5 g/l, 
half of the usual amount in EMM. Adding an equal volume of EMM 
(+N +supplements) avoids the medium shock that could occur if a 
large volume of NH4Cl solution were added. 
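
The 2.5 g/l figure in step 6 follows from mixing equal volumes of starved culture (no NH4Cl) and complete EMM. A minimal sketch of that arithmetic is given below; it assumes that complete EMM contains 5 g/l NH4Cl (implied by "half of the usual amount" above) and that volumes are additive. The function name is hypothetical.

```python
def mixed_concentration(c1: float, v1: float, c2: float, v2: float) -> float:
    """Concentration after combining two solutions (same units for c1 and c2).

    Assumes volumes are additive.
    """
    if v1 < 0 or v2 < 0 or (v1 + v2) == 0:
        raise ValueError("volumes must be non-negative and not both zero")
    return (c1 * v1 + c2 * v2) / (v1 + v2)


if __name__ == "__main__":
    # 100 ml starved culture in EMM-N (0 g/l NH4Cl) plus 100 ml EMM (+N),
    # taking 5 g/l NH4Cl for complete EMM as implied by the text.
    print(mixed_concentration(0.0, 100.0, 5.0, 100.0), "g/l NH4Cl")
```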

Harvest and fix cells in 70% ethanol for flow cytometry and morphology 
(Section 5.1), including an asynchronous culture sample (prepared before 
starving cells), a G1 starved sample (t = 0), and samples every 15–30 min 
after release. 

Alternates: While temperatures >25 °C will lead to a predominantly G2 
cell population, this is not the recommended method for achieving uniform 
G2 arrest. The tightest G2 synchrony is achieved using the cdc25-22 
temperature sensitive allele (for the basic protocol, refer to Section 3.7.3). 
As an alternative, useful for cells with two temperature sensitive alleles or 
diploid cultures (which are sensitive to nitrogen depletion), we recommend a 
variation on the nitrogen starvation protocol above. Removing both 
glucose and nitrogen (EMM−N−G) enriches a population for G2 cells, 
and is particularly useful to arrest diploid cells for release into meiosis in the 
absence of nitrogen. 




4. Molecular Analysis 



4.1. Fission yeast plasmids 



Fission yeast plasmids were originally derived from budding yeast vectors, 
but these are maintained poorly in S. pombe due to inefficient replication, 
a high frequency of rearrangement, and poor complementation of S. pombe 
auxotrophs by most S. cerevisiae markers. Generally, modern fission yeast 
episomes contain an S. pombe replication origin (ars) and an S. pombe-specific 
nutritional or drug-resistance marker. The exception is the S. cerevisiae 
LEU2 marker, which efficiently complements S. pombe leu1 mutations 
and is still common. Drug selection markers including hygromycin, 
G418, nourseothricin, and phleomycin are available under S. pombe-specific 
promoters (Sato et al., 2005). 

Plasmid copy number varies from a few to 20 (Patterson et al., 1995). 
There are no S. pombe single-copy or centromere vectors, because the 
fission yeast centromere is too large to encompass on a simple plasmid. 
Without a centromere, lower copy number plasmids may be lost frequently, 
and in some cases 10–20% of a population of cells may lack the plasmid of 
interest even under positive selection. Additionally, plasmids are unstable 
through meiosis and are typically recovered in only 10–20% of spores. 
One solution to the stability issue is to target integration into the genome at 
a known sequence such as leu1 (Keeney and Boeke, 1994). 

Some plasmids may also carry counter-selectable markers including 
ura4+ (which can be selected against with 5-FOA, and selected for on 
minus-uracil media), can1+ (selected against with the toxic arginine 
analogue canavanine; Fantes and Creanor, 1984), or adh-thymidine kinase 
(which renders cells sensitive to the thymidine analogue FUdR; Kiely 
et al., 2000). Of course, any cloned sequence associated with a growth 
defect is functionally a counter-selectable marker. The sup3-5 marker, a 
tRNA that suppresses the nonsense allele ade6-704, is an example of this that 
has been exploited to monitor successful integration. There is a modest 
toxicity associated with this marker on high-copy plasmids due to 
inappropriate insertion of the wrong tRNA. This leads to a subpopulation of 
cells that lose the plasmid and are pink on low-adenine media due to the ade6 
deficiency. Conversely, if integrated into the genome, the sup3-5 marker is 
stable, there is no Ade− subpopulation, and the colonies are white on 
low-adenine media (Grallert et al., 1993). 

Expression of a cloned sequence from a plasmid relies on a cloned 
promoter. The two most common constitutive promoters are adh1 
(constitutive high expression; Broker and Bauml, 1989) and CaMV (constitutive 
low expression, engineered to be repressed with tetracycline; Gmunder and 
Kohli, 1989). Unfortunately, inducible expression options in S. pombe are 
more limited than in budding yeast. The most popular inducible promoter 
is the nmt1 promoter (no message in thiamine), which is repressed in the 
presence of thiamine and gives one of the strongest inductions in fission 
yeast (Maundrell, 1993). The nmt1 promoter is repressed by adding 
thiamine stock solution (2000×, 10 mg/ml) to 15 µM or 5 µg/ml just 
before use. Rich medium (YE/YES) already contains thiamine, as does SD 
medium for budding yeast, and therefore cannot be used for high-level 
protein expression in S. pombe under the nmt1 promoter. Thus, synthetic 
minimal medium (EMM or PMG) is required, to which thiamine can be 
added as necessary. When transforming a construct that uses the nmt1 
promoter, we recommend including thiamine in growth and maintenance steps 
to prevent unwanted toxicity from gene overexpression. This has the added 
benefit of promoting growth, because thiamine is modestly growth limiting 
and its presence accelerates growth rate. 
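
The repressing dose quoted above (a 2000× stock at 10 mg/ml, used at 5 µg/ml, or about 15 µM) can be checked with a short calculation. This is a sketch under one stated assumption: the stock is taken to be thiamine hydrochloride (337.27 g/mol), which the text does not specify. The function names are hypothetical.

```python
# Assumption: the stock is thiamine hydrochloride; the text does not say
# which salt is used.
THIAMINE_HCL_MW = 337.27  # g/mol


def stock_volume_ul(final_ug_per_ml: float, culture_ml: float,
                    stock_mg_per_ml: float = 10.0) -> float:
    """Microlitres of stock needed to reach the final concentration.

    1 mg/ml equals 1 ug/ul, so total ug divided by the stock
    concentration in mg/ml gives ul directly.
    """
    return final_ug_per_ml * culture_ml / stock_mg_per_ml


def ug_per_ml_to_micromolar(ug_per_ml: float, mw: float = THIAMINE_HCL_MW) -> float:
    """Convert ug/ml to uM for a compound of molecular weight mw (g/mol)."""
    return ug_per_ml * 1000.0 / mw


if __name__ == "__main__":
    print(stock_volume_ul(5.0, 100.0), "ul of stock per 100 ml")
    print(f"{ug_per_ml_to_micromolar(5.0):.1f} uM")
```

For a 100-ml culture this gives 50 µl of stock, which matches a 1/2000 dilution.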

To activate nmt1-driven expression, liquid cultures grown with thiamine are 
washed to remove excess thiamine, by filtration or centrifugation (500 × g, 
5 min to pellet cells). Cells are washed twice in medium without thiamine, 
then resuspended in an equal volume of medium and grown at the appropriate 
temperature for a minimum of 16 h to completely activate the nmt1 
promoter. The long induction reflects the considerable pools of intracellular 
thiamine that must be depleted before the promoter becomes active. 
However, nmt1 still has low expression in repressive conditions (+ thiamine); 
therefore, only unstable proteins may be reliably "switched off" by adding 
thiamine. Some fine tuning of this system has been achieved by isolating 
mutations that attenuate both induced and repressed levels, making medium- 
and low-level derivatives (nmt* (Tommasino and Maundrell, 1991) and nmt** 
(Basi et al., 1993; Forsburg, 1993)), or by titrating thiamine concentration to 
generate intermediate levels of repression (Javerzat et al., 1996; Tommasino 
and Maundrell, 1991). There are numerous versions of nmt1-containing 
plasmids with different epitope tags and promoter derivatives (Craven et al., 
1998; Forsburg and Sherman, 1997; Tasto et al., 2001). 

Additional regulated promoters have been characterized including fbp1 
(glucose; Hoffman and Winston, 1989), inv1 (invertase; Iacovoni et al., 
1999), urg1 (uracil; Watt et al., 2008), and ctr4 (copper; Bellemare et al., 
2001). Although these have not become as popular as the nmt1 promoter, 
which is well characterized and widely employed, some of these promoters 
have the advantage of a faster induction time. However, the constructs 
available using these alternative promoters are not as diverse as those 
available for nmt1, and the kinetics of transcript induction and repression 
from these promoters must be determined by the user for individual 
proteins. 

4.2. Integrations in fission yeast 

The S. pombe genome is easily modified by integration via homologous 
recombination, at which it is very proficient, although it is also capable of 
performing nonhomologous recombination. Ectopic integration into an 
auxotrophic locus may be accomplished with any plasmid that contains no 
ars and has sufficient sequence to target the event. For example, a plasmid 
containing a construct of interest may be integrated into leu1-32 by 
linearizing the integrating plasmid within the cloned leu1 gene to target the 
insertion (Keeney and Boeke, 1994). 

Integrations for precise gene disruptions or replacements (knock-outs 
and knock-ins) require additional planning. Maximum integration 
efficiency with minimal background occurs using homologous sequences 
>300 base pairs (bp) (Krawchuk and Wahls, 1999). Short tracts of 
homology <100 bp may be used (Bahler et al., 1998) but can generate a high 
background of nonhomologous events compared to S. cerevisiae. The success 
of any homologous integration depends on the specific gene and chromatin 
context. Additionally, fragments or plasmids without replication origins 
can sometimes be maintained as unstable concatamers. Thus, appropriate 
screening for locus-specific integration must be carried out, including 
verification of stable integration (2:2 segregation through meiosis). This is 
easily determined using random spore analysis. 
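
In random spore analysis, 2:2 segregation shows up as roughly equal numbers of marker-positive and marker-negative colonies. One way to compare colony counts with that 1:1 expectation is a chi-square test with one degree of freedom. The sketch below is an illustration rather than part of the published protocol: it assumes the integrant was crossed to a strain lacking the marker, the function names are hypothetical, and the 5% significance threshold is a conventional choice.

```python
CHI2_CRITICAL_P05_DF1 = 3.841  # chi-square critical value, p = 0.05, 1 d.f.


def chi_square_one_to_one(marker_plus: int, marker_minus: int) -> float:
    """Chi-square statistic for observed counts against a 1:1 expectation."""
    total = marker_plus + marker_minus
    if total == 0:
        raise ValueError("no colonies scored")
    expected = total / 2.0
    return ((marker_plus - expected) ** 2 +
            (marker_minus - expected) ** 2) / expected


def consistent_with_two_to_two(marker_plus: int, marker_minus: int) -> bool:
    """True if the counts do not differ significantly from 1:1 at p = 0.05."""
    return chi_square_one_to_one(marker_plus, marker_minus) < CHI2_CRITICAL_P05_DF1


if __name__ == "__main__":
    # Placeholder counts for illustration only.
    print(consistent_with_two_to_two(52, 48))  # close to 1:1
    print(consistent_with_two_to_two(15, 85))  # marker strongly under-represented
```

A marker recovered in far fewer than half of the spores is the pattern described above for unintegrated plasmids, which are recovered in only a minority of spores.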

4.3. Isolation and analysis of novel mutations 

Genome mutagenesis can be performed with nitrosoguanidine or ethyl 
methane sulfonate (EMS), although recently nonchemical methods have 
been considered, for example, ultraviolet (UV) radiation or insertional 
mutagenesis (Chua et al., 2000; Wang et al., 2004). The general methods 
of chemical or radiation-induced mutagenesis are the same for budding and 
fission yeasts. They require determination of dose and lethality, screening or 
selection, and subsequent genetic analysis including backcrossing, 
dominance and recessiveness testing, complementation, and linkage analysis. 
For detailed considerations in setting up a genetic screen, several excellent 
references are Barbour and Xiao (2006), Barbour et al. (2006), Boone 
et al. (2007), and Forsburg (2001). 

If mutagenizing a plasmid for point mutant analysis of a specific gene, the 
cDNA is first put into an expression plasmid with a marker that will 
complement the strain in question. Mutagenesis can be performed in vitro using 
hydroxylamine (Liang and Forsburg, 2001; Rose and Fink, 1987), via 
mutagenic PCR, or using mutagenic bacteria during amplification (Rasila et al., 
2009). The mutagenized plasmid is transformed into cells with a mutant copy 
of the allele (deletion or temperature sensitive mutant) and screened for 
rescue under conditions that would otherwise kill the cells. Optimizing the 
screen ahead of time is critical to success, and Phloxin B staining helps 
distinguish colonies that are attenuated for growth under appropriate conditions. 

4.4. Transformation of DNA into fission yeast 
4.4.1. Electroporation of fission yeast 

We find electroporation to be the fastest and most efficient method for DNA 
transformation into fission yeast. Cells are made electrocompetent by 
washing in cold sorbitol, although sorbitol is removed following electroporation 
because it retards growth during selection. 

1. Grow 25–50 ml of fission yeast cells in minimal medium to mid-log 
phase (OD600 0.4–1.0), approximately 1–2.5 × 10^7 cells/ml. 

2. Harvest cells in a sterile tube and centrifuge at 500 × g, 5 min, 4 °C. Keep 
cells and solutions on ice in subsequent steps. 

3. Wash cells once in 0.5 volume of chilled, sterile water. Centrifuge as 
above. 

4. Wash cells once in 0.25 volume of chilled, sterile 1 M sorbitol, and 
centrifuge as above. (Note: 15 min incubation of cells in 1 M 
sorbitol + 25 mM DTT may increase electrocompetence; Suga and 
Hatakeyama, 2001.) 

5. Resuspend cells at 1–5 × 10^9 cells/ml in chilled 1 M sorbitol. Aliquot 
40 µl to prechilled microfuge tubes and add 100 ng DNA. Mix gently 
and incubate 5 min on ice. 

6. Transfer mixture to prechilled 0.5 cm-gap electroporation cuvettes. 
Electroporate cells according to the manufacturer's instructions. For a 
BioRad instrument: 1.5 kV, 200 Ω, 25 µF. 

7. Immediately add 0.9 ml of cold, 1 M sorbitol, and remove from cuvette 
into a microfuge tube on ice. Keep tubes on ice while all electropora- 
tions are performed. 

8. Centrifuge tubes (3000 × g, 5 min, 4 °C) and remove sorbitol. Wash cells 
once with 1 ml of chilled sterile water. Plate cells as soon as possible onto 
minimal selective medium. Transformants appear in 3–5 days at 32 °C, or 
5–7 days at 25 °C. 



4.4.2. Rapid lithium acetate transformation 

This method is quick, independent of special equipment, and uses small 
culture volumes, resulting in transformation efficiency of approximately 10 
transformants per microgram of DNA (Kanter-Smoler et al., 1994). 

1. Use a fresh colony from a plate and resuspend in 150 µl of 0.1 M lithium 
acetate (LiOAc), pH 4.9. Incubate 1 h at room temperature. 

2. Add 1 µg DNA and mix (less DNA may be used, but 1 µg is 
recommended to start due to the lower transformation efficiency). 

3. Add 350 µl of 50% polyethylene glycol 8000 (PEG8000). Mix, and 
incubate 1 h at room temperature. 

4. Centrifuge cells at 3000 × g, 5 min, at room temperature to pellet. Wash 
cells once with 1 ml of sterile water, then resuspend in a small volume of 
sterile water to plate onto selective medium. 

A more stringent version of this protocol produces higher 
transformation efficiency, and is better when transforming PCR products for 
homologous integration (Bahler et al., 1998; Forsburg, 2003b). 
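
Transformation efficiency is expressed above as transformants per microgram of DNA. A minimal sketch of that calculation follows, with a correction for plating only part of the transformation mix. The function name is hypothetical and the numbers in the example are placeholders.

```python
def transformants_per_ug(colonies: int, dna_ug: float,
                         fraction_plated: float = 1.0) -> float:
    """Transformation efficiency in transformants per microgram of DNA.

    fraction_plated is the share of the transformation mix spread on the
    plate that was counted (1.0 if everything was plated).
    """
    if dna_ug <= 0:
        raise ValueError("DNA amount must be positive")
    if not 0 < fraction_plated <= 1:
        raise ValueError("fraction_plated must be in (0, 1]")
    return colonies / (dna_ug * fraction_plated)


if __name__ == "__main__":
    # Placeholder example: 100 colonies from half of a 1-ug transformation.
    print(transformants_per_ug(100, 1.0, 0.5))
```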
4.5. Harvesting DNA, RNA, and protein from fission yeast 

The methods for isolating nucleic acids and protein are similar to those for 
S. cerevisiae, although the cell wall of S. pombe can be difficult to crack. The 
lysis step is critical to success, and can be accomplished enzymatically or 
mechanically. In large-scale preparations, fission yeast may be frozen in 
liquid nitrogen and ground (e.g., Retsch RM100 Mortar Grinder; 
www.retsch.com), broken (e.g., Retsch Ball Mill, RMM301), or microfluidized 
(e.g., Microfluidics M110-S; www.microfluidicscorp.com); these approaches 
are more efficient but more expensive. 



4.5.1. DNA preparation from fission yeast 

This small-scale preparation should deliver approximately 1 µg of DNA per 
OD harvested. Scale up as required. This protocol can also be used to isolate 
plasmids from yeast for transformation into E. coli (Hoffman and Winston, 
1987). 

1. Grow a 5–10 ml culture to OD600 of 0.8 or higher. Harvest, and wash 
once with 1× PBS or water, transferring cells to a screw-cap tube. Pellets 
may be stored at −80 °C. 

2. Add 250 µl of DNA lysis buffer (see Table 32.3), 250 µl of acid-washed 
glass beads, and 250 µl of phenol:chloroform:isoamyl alcohol (PCI; 
25:24:1). Vortex for 1 min, rest 1 min, and repeat. 

3. Centrifuge for 5 min at >13,000 × g and then move the aqueous top 
phase to a fresh microfuge tube, avoiding the interface. Estimate the 
lysate volume. 

4. Add 250 µl TE (total ~450 µl), and 1 volume of PCI. Vortex well and 
centrifuge 5 min. Recover the aqueous layer to a new tube. Repeat the PCI 
extraction if necessary (e.g., cloudy interface). 

5. Add 1 ml 100% ethanol, vortex, and chill for 15 min on ice or at −20 °C. 
Centrifuge at top speed for 5 min. 

6. Wash pellet once with 1 ml 70% ethanol (v/v in water). Air dry pellet 
briefly. 

7. Resuspend pellet in 100 µl TE + 20 mg/ml RNaseA (boiled to 
inactivate DNases), and store at 4 or −20 °C. 



4.5.2. Colony PCR 

This method uses zymolyase enzyme to digest the cell walls, and the 
spheroplasts are then used directly in PCR reactions with gene-specific primers. 
The number of cycles may be increased to detect product from this method. 

1. Aliquot 10 µl of zymolyase solution (Table 32.3) to fresh microfuge or 
PCR tubes. 
Table 32.3 Buffers for biochemistry 

Buffer                            Composition                               Notes 

DNA lysis buffer                  10 mM Tris, pH 8.0                        • For isolation of fission yeast genomic 
                                  100 mM NaCl                                 DNA or episomal plasmid isolation 
                                  1 mM EDTA, pH 8.0                         • Variation: addition of 1% Triton X-100 
                                  1% SDS                                      to buffer 

Zymolyase solution (colony PCR)   2.5 mg/ml zymolyase 20T (a)               • Aliquots may be stored at −20 °C for 
                                  1.2 M sorbitol                              >6 months, but should not be refrozen 
                                  0.1 M sodium phosphate, pH 7.4 

RNA lysis buffer                  50 mM sodium acetate, pH 4.3              • Due to 1% SDS, this is essentially 
                                  10 mM EDTA                                  RNase-free and does not need to be 
                                  1% SDS                                      treated with DEPC 

Soluble protein lysis buffer 1    50 mM Tris, pH 7.5                        • Add PMSF fresh before use 
                                  150 mM NaCl                               • May add additional protease or 
                                  5 mM EDTA                                   phosphatase inhibitors if desired 
                                  10% glycerol 
                                  1 mM phenylmethylsulfonylfluoride (PMSF) 

"STOP" buffer                     0.2% sodium azide                         • May also be made as a 10× stock 
                                  150 mM NaCl                               • Store at 4 °C for less than 2 months 
                                  10 mM EDTA, pH 8.0 
                                  Use ice-cold 

Laemmli buffer, part A            2.5% SDS                                  • Acetone-washed pelleted protein is 
(1× buffer)                       60 mM Tris base (un-pH'd)                   solubilized by boiling in this 
                                  6.25 mM EDTA, pH 8.0                      • Add fresh protease inhibitors 
                                  Make up in water, store at room             (Sigma P8125) 
                                  temperature                               • Add phosphatase inhibitors if desired 

Laemmli buffer, part B            50% glycerol                              • 1 ml aliquots made as needed; can be 
(5× buffer)                       100 mM Tris base                            stored at −20 °C for several months 
                                  0.1-0.05% (w/v) bromophenol blue          • Add fresh 1 M DTT to 50 mM in samples 
                                  400 mM dithiothreitol                       if desired 
                                  10% 2-mercaptoethanol (v/v), optional 

(a) Zymolyase 20T from Seikagaku Biobusiness, product #120491. 

2. Use a clean plastic pipette tip to pick up a single yeast colony. Avoid 
toothpicks, which may interfere with the PCR reaction. Touch the colony 
onto a numbered patch of a plate (appropriate medium) to keep as a 
master copy. 

3. Transfer the majority of the colony to the zymolyase solution (step 1), 
and rinse the yeast from the tip into the solution. 

4. Incubate 10 min at 37 °C. Use 2 µl of spheroplasts per 50 µl PCR 
reaction. 



4.5.3. RNA preparation from fission yeast 

This small-scale preparation should yield approximately 2 µg of RNA per 
OD harvested. Scale up as required. Alternatively, reagents such as TRIzol 
(Invitrogen) may be used to purify RNA. After the phenol/chloroform steps, 
take care that solutions, tubes, and tips are all RNase free. 

1. Grow a 20–50 ml culture to OD600 of 0.5–1.0. Harvest approximately 20 
ODs of cells. 

2. Resuspend pellet in 250 µl of RNA lysis buffer (see Table 32.3), and 
transfer to a screw-cap tube. 

3. Add 250 µl of acid-washed glass beads (0.5 mm) and 250 µl of acidic 
phenol (pH 4.7). Vortex for 1 min, rest 1 min, and then vortex 1 min. 

4. Incubate 10 min at 65 °C. Centrifuge tubes for 5 min at >13,000 × g. 
Move the aqueous top phase to a fresh microfuge tube, avoiding the 
interface. 

5. Add 250 µl TE or RNA lysis buffer, and estimate the volume of lysate 
(~450 µl). Repeat the organic extraction with 1 volume of acidic phenol, 
taking the aqueous phase. 

6. Add 1 volume of phenol/chloroform and vortex. Centrifuge and move 
the aqueous layer to a fresh tube. 

7. Add 1 ml 100% ethanol, vortex, and chill for 15 min on ice or at −20 °C. 
Centrifuge at top speed for 10 min, 4 °C. 

8. Wash pellet once with 1 ml 70% ethanol (v/v in water). Air dry pellet 
briefly. 

9. Resuspend pellet in an appropriate buffer (e.g., RNase-free water). Store 
samples at −80 °C and do not refreeze. For long-term storage, keep in 
70% ethanol at −80 °C. 



4.5.4. Soluble protein extract 

This protocol is a starting point for known soluble proteins, and is 
suitable for direct Western blots or immunoprecipitation. A whole-extract 
preparation using TCA follows (Section 4.5.5). Treating cells with a 
"STOP buffer" containing sodium azide will prevent proteolysis or other 
changes during preparative steps, and "stopped" cell pellets in screw-cap 
lysate tubes may be frozen at −80 °C until all samples are ready to prepare 
together (at step 3). Solutions and samples should be kept at 4 °C as much as 
possible during lysate preparation. 

1. Grow yeast to mid-log phase and harvest 5–20 ODs in a conical tube. 
Add 10× STOP buffer to 1×, or 0.1 volume of 2% sodium azide 
solution, and incubate culture on ice for 5 min. Centrifuge 5 min, 
500 × g, 4 °C, and decant supernatant. 

2. Wash cells once with 1× PBS. Centrifuge as above. Transfer pellet to a 
screw-cap lysate tube in a small volume of PBS and centrifuge to remove 
PBS (3000 × g, 5 min, 4 °C). 

3. Resuspend pellet in 200 µl lysis buffer. Add approximately 500 µl 
acid-washed glass beads (0.5 mm) to the level of the buffer meniscus. 

4. Vortex for 5 min at 4 °C at top power. Alternately, use a bead beater or 
disruptor (e.g., Fastprep); typically 4–6 pulses, 5.5 m/s, 45 s each, resting 
for 2 min on ice in between. Check microscopically for cell breakage, 
and repeat disruption if required. 

5. Puncture the top and bottom of the screw-cap tube with a red-hot 
needle and place in a collection tube. Centrifuge tubes to collect lysate, 
10 s at top speed in a microfuge or 500 × g in a swinging bucket rotor. 
Collect lysate into a fresh tube. 

6. Spin lysate for 5 min, >13,000 × g, 4 °C, then remove cleared lysate to a 
fresh tube. 

7. Quantitate protein concentration and store remaining lysate at −80 °C. 
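
Centrifugation steps in these protocols are given as relative centrifugal force (× g), whereas many instruments are set in rpm. The standard conversion is RCF = 1.118 × 10^-5 × r × rpm^2, with r the rotor radius in cm. The sketch below applies it; the 10-cm radius in the example is a hypothetical value, so check the rotor manual for the real one.

```python
import math

# Standard relation: RCF = 1.118e-5 * radius_cm * rpm**2
RCF_CONSTANT = 1.118e-5


def rpm_from_rcf(rcf: float, radius_cm: float) -> float:
    """Rotor speed (rpm) that gives the requested RCF at the given radius."""
    if rcf <= 0 or radius_cm <= 0:
        raise ValueError("rcf and radius must be positive")
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


def rcf_from_rpm(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force (x g) at the given speed and radius."""
    if rpm < 0 or radius_cm <= 0:
        raise ValueError("rpm must be non-negative and radius positive")
    return RCF_CONSTANT * radius_cm * rpm ** 2


if __name__ == "__main__":
    # Hypothetical rotor radius of 10 cm.
    for rcf in (500, 3000, 13000):
        print(rcf, "x g ->", round(rpm_from_rcf(rcf, 10.0)), "rpm")
```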



4.5.5. Whole cell protein extract with trichloroacetic acid (TCA) 

TCA protein extraction and precipitation is an alternative to soluble lysates, 
and is used to assess insoluble proteins, such as those that remain chromatin 
bound. Consequently, TCA lysis better represents the total protein pool 
within cells but is not suitable for subsequent immunoprecipitation. 

1. Prepare cells as in the preceding protocol to step 2. Add 400 µl of 20% 
TCA (v/v in water) to the pellet and approximately 500 µl of acid-washed 
glass beads. Disrupt as in step 4 above, and then puncture the tube top and 
bottom with a needle and place in a collection tube. 

2. Centrifuge tubes to collect lysate, 10 s at top speed in a microfuge or 
500 × g in a swinging bucket rotor. 

3. Add 800 µl of 5% TCA (v/v in water) to wash beads. Repeat step 2. 

4. Harvest all TCA lysate (now 10% TCA, v/v). Centrifuge at >13,000 × g, 
5 min, room temperature, to pellet protein. Remove supernatant. 

5. Add 1 ml of acetone and vortex well, and then rock 5 min at room 
temperature. Centrifuge at >13,000 × g in a microfuge, 3 min, room 
temperature, and remove the acetone wash. Repeat twice, for a total of 
three acetone washes. 

6. Air-dry protein in a fume hood for 10–30 min until completely dry. 

7. Add 200 µl of Laemmli buffer A (see Table 32.3) with fresh protease 
inhibitors. Boil 3–5 min in a heat block at 100 °C. Tip: tubes will pop 
open; secure caps with holders. 

8. Centrifuge at >13,000 × g, 3 min, then remove lysate to a fresh tube. 
Repeat if lysate is cloudy. 

9. Quantitate protein by BCA assay and store lysates at −80 °C. For 
SDS–PAGE, make up equal protein amounts in Laemmli buffer A, adding 5× 
Laemmli buffer B (Table 32.3) to 1×. Heat samples at 95 °C for 5 min 
and then centrifuge for 5 min at >13,000 × g before loading for 
electrophoresis. 




5. Cell Biology 



5.1. Preparation and analysis of cell populations 



In this section, we describe methods to fix and prepare cells for DNA 
quantitation by flow cytometry, nucleus and septum staining, and basic 
immunofluorescence. 

5.1.1. Ethanol fixation of cells for flow cytometry (FACS) and 
septation index 

Fission yeast cells fixed in 70% ethanol may be stored at 4 °C for >1 year. 
The simplest and quickest method to fix cells, best for a time course with 
closely spaced collection points, is to dispense 1 ml of 100% ethanol into 
labeled microfuge tubes and chill to 4 °C ahead of time. Then, 420 µl of 
culture is added directly to the tube and vortexed, bringing the ethanol to 
70%. For larger culture volumes: 






1. Remove an appropriate amount of culture. Centrifuge at 500 × g, 5 min. 
Decant. 

2. Vortex the pellet well to resuspend cells. 

3. Add an equal volume of cold 70% ethanol (v/v in water, prepared ahead 
and stored at −20 °C) very slowly while vortexing continually, then store 
samples at 4 °C for a minimum of 15 min. 
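
The quick fixation described at the start of this section relies on simple dilution: adding 420 µl of culture to 1 ml of 100% ethanol gives roughly 70% ethanol. The sketch below reproduces that arithmetic and solves the inverse problem for other tube volumes. It assumes volumes are additive (ethanol and water contract slightly on mixing, which is ignored here), and the function names are hypothetical.

```python
def final_ethanol_percent(ethanol_ul: float, sample_ul: float,
                          ethanol_percent: float = 100.0) -> float:
    """Ethanol concentration (% v/v) after adding an aqueous sample.

    Assumes additive volumes.
    """
    return ethanol_percent * ethanol_ul / (ethanol_ul + sample_ul)


def sample_volume_for_target(ethanol_ul: float, target_percent: float,
                             ethanol_percent: float = 100.0) -> float:
    """Sample volume (ul) that dilutes the ethanol to target_percent."""
    if not 0 < target_percent <= ethanol_percent:
        raise ValueError("target must be positive and not exceed the stock")
    return ethanol_ul * (ethanol_percent / target_percent - 1.0)


if __name__ == "__main__":
    print(f"{final_ethanol_percent(1000, 420):.1f}% ethanol")
    print(f"{sample_volume_for_target(1000, 70):.0f} ul culture for exactly 70%")
```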



5.2. Flow cytometry (whole cells) 

This is the most straightforward method to assess DNA content in fission 
yeast samples. However, the cell cycle of S. pombe must be considered when 
interpreting the data: cells do not separate until after S phase, meaning that 
an asynchronous haploid population displays a predominantly 2C DNA 
content, and S phase is commonly detected as a shift from 2C to 4C. The 
key to data interpretation is including control samples that are prepared at 
the same time as the other samples, including asynchronous (G2 = 2C), 
nitrogen-starved (G1 = 1C), and diploid (2C/4C) controls. Samples may be 
prepared directly in flow cytometer tubes, or in microfuge tubes and 
transferred to flow tubes later. For additional protocols and considerations, 
refer to Sabatinos and Forsburg (2009). 

1. Remove 300 µl of cells fixed in 70% ethanol, and pellet. Wash cells 
twice with 1–3 ml of 50 mM sodium citrate (made in water from a 
0.5 M sodium citrate autoclaved stock solution), vortexing well with 
each wash. 

2. Resuspend cells in 0.5 ml of 50 mM sodium citrate + 0.1 mg/ml RNase 
A (prepared as a 10 mg/ml stock and boiled to kill DNases). Vortex. 
Incubate for 2 h at 37 °C. 

3. Add 0.5 ml of 50 mM sodium citrate + 1 µM Sytox Green. Vortex and 
incubate for 15 min or longer at 4 °C or on ice. Samples may be sealed 
and stored at 4 °C for several months, protected from light, and rerun at 
a later date. 

4. Just before running, sonicate samples with a microtip sonicator for 5 s on 
low to medium power. Samples are acquired using FL1 for green 
fluorescence. On a Becton Dickinson FACScan flow cytometer with 
488 nm excitation, guideline settings are FSC voltage E00, gain 1.36, 
linear mode; SSC voltage 300, gain 2.47, linear mode; FL1 (Sytox 
fluorescence) voltage 455, gain 1.0, logarithmic mode; FL1-Area gain 
3.64, linear mode; FL2-Wide gain 3.60, linear mode. 



5.3. Flow cytometry (nuclear "ghosts") 

Due to the replication and division cycle of fission yeast, the cellular 
volume, shape, and mitochondrial DNA content may affect DNA profiles, 
making it difficult to distinguish a single cell particle with a 2C nucleus from 
a single cell particle with two 1C nuclei (Sazer and Sherwood, 1990). 
By digesting fixed cells with enzymes, the cell wall and membrane are 
removed, leaving just the nuclei (Carlson et al., 1997). Appropriate controls 
are the same as for whole cell FACS, but must be prepared as nuclei 
alongside samples. 

1. Place 1 × 10^7 ethanol-fixed cells in a microfuge tube (typically 1 ml of 
fixed culture) and centrifuge samples at 3000 × g, 5 min to pellet cells. 

2. Wash cells once with 1 ml of 0.6 M KCl. Vortex, and centrifuge. 

3. Resuspend cells in 1 ml of 0.6 M KCl + 0.5 mg/ml zymolyase 
20T + 1.0 mg/ml lysing enzymes. Vortex, and incubate for 30 min 
at 37 °C. 

4. Centrifuge cells as above, and resuspend pellet in 1 ml of 0.6 M 
KCl + 0.1% Triton X-100. Rotate samples for 5 min at room temperature. 

5. Centrifuge cells, and wash once in 1 ml of 1× PBS. 

6. Resuspend washed cells in 1 ml PBS + 0.1 mg/ml RNaseA. Vortex and 
incubate at 37 °C for 2 h to overnight. 

7. Sonicate cells to release nuclei without fragmenting them. This will 
require optimization for each sonicator; we use 50% amplitude for 5 s 
with a digital Branson 5 mm microtip sonicator. Nuclei may be stored at 
4 °C for several months. 

8. Stain by mixing 100 µl of nuclei with 400 µl of PBS + 1 µM Sytox 
Green. Vortex and incubate 15–30 min in the dark, then run samples. 
Guideline settings are similar to those for whole cells (Section 5.2) with 
the following exceptions to account for the smaller size of nuclei: FSC 
voltage E00, gain 5.74, linear mode; SSC voltage 304, gain 6.55, linear 
mode. 



5.4. DNA and septum staining in fixed cells 

Septation index may be monitored in real time, or using fixed cells and 
calcofluor staining. This is a good metric for cell synchrony as well as an 
index of S phase in a population, since S phase corresponds with the peak of 
cytokinesis in exponentially growing cells. Staining fixed cells is useful for 
cell morphology and is essential to monitor mitotic index, chromosome 
segregation, and meiotic progression using DAPI staining. 

1. Resuspend ethanol-fixed cells, and remove 100 µl to a fresh microfuge 
tube. Wash cells with 1 ml water to rehydrate, then resuspend in 
50–100 µl water. 

2. Pipette 5 µl of cells onto a poly-lysine coated or charged slide. Heat fix 
cells on a slide warmer set at low for a few minutes until barely dry 
around the edges, or let air-dry. Do not overdry! 

3. Add 5 µl of mounting solution (50% glycerol in water (v/v) with 1 mg/ml 
p-phenylenediamine (PPD), 1 µg/ml DAPI, 0.2 mg/ml calcofluor; will 
keep a few days, but best prepared fresh). Cover with a coverslip, seal with 
VALAP or nail polish, and visualize using a microscope with a UV 
excitation source. 

4. Alternately, resuspend rehydrated cells (from step 1) in 100 µl of water 
with 1 µg/ml DAPI and 0.2 mg/ml calcofluor. Mix for 5 min at room 
temperature in the dark, then wash cells three times with 1 ml water. 
Resuspend cells in 50–100 µl water and mount as in step 2. The mounting 
solution is 50% glycerol in water (v/v) with 1 mg/ml PPD as antifade. 

Calcofluor staining can be difficult to optimize, and is often very bright. 
If this is the case, let the sample photobleach briefly and reexpose to 
photograph nuclei and septa. Calcofluor is prepared in 50 mM sodium 
citrate, 100 mM sodium phosphate, pH 6.0, for long-term storage at 
−20 °C (centrifuge aliquots to remove particulates before use). VALAP is 
a 1:1:1 (w/w/w) mixture of petroleum jelly, lanolin, and paraffin that is 
combined in a beaker and melted on low heat to form a suspension used to 
seal slides. 

5.5. Fission yeast whole-cell immunofluorescence 

Protocols for S. pombe immunofluorescence are similar to other cells, 
although the fission yeast cell wall is somewhat resistant to digestion. The 
critical parameter is getting samples into fixative quickly, and consistency of 
fixation between experiments. The choice of fixative depends on the 
purpose and the epitope stability. For additional considerations on micros- 
copy and immunofluorescence using fission yeast, refer to Green et al. 
(2009). 

1. Harvest cells, and centrifuge at 500 × g for 5 min to pellet cells. Decant. 
Vortex cell pellet and add an equal volume of fixative solution (refer 
to Table 32.4). Incubation time and temperature are fixative-dependent. 
Centrifuge cells, and wash pellet twice in 1× PBS. Store cells up to 
6 months at 4 °C in 1× PBS. Ethanol- or methanol-fixed cells may be 
stored in fixative at 4 °C, and washed well before step 2. 

2. Wash cells once with 1 ml 1× PBS, and then 1 ml 1× PEM (100 mM 
PIPES, pH 6.9, 1 mM EGTA, 1 mM MgSO4; may be made as a 10× 
stock, filter sterilized and diluted to 1× as required). 

3. Resuspend cells in 1 ml of PEMS (100 mM PIPES, 1 mM EGTA, 
1 mM MgSO4, 1.2 M sorbitol; pH to 6.5-6.9; filter sterilize and store at 
room temperature) with the digesting enzymes: 0.2 mg/ml lysing enzymes 
(Sigma #L1412) and 0.5 mg/ml zymolyase 20T (Seikagaku Biobusiness 
#120491). Incubate for 10-30 min at room temperature. Check 
digestion microscopically (cells will lose their refractive halo), and by adding 
a drop of 1% SDS to the slide (to induce lysis). Complete digestion is 
not required, merely enough to allow antibodies to enter and exit. 

4. Wash cells three times in 1 ml PEMS. 

5. Block cells in 1 ml PEMBAL (100 mM PIPES, 1 mM EGTA, 1 mM 
MgSO4, 3% BSA, 0.1% NaN3, 100 mM lysine hydrochloride) for 
>30 min at room temperature on a rocking platform. 

Note: the blocking agent, BSA, may be changed to fetal calf serum (FCS), or a 
combination of BSA and FCS. NaN3 inhibits fungal growth and 
permits extended incubations at room temperature, but is optional if 
solutions are made fresh and used quickly, and if overnight incubations 
are performed at 4 °C. Lysine is optional, and reduces background 
staining by blocking negative charges, particularly in the nucleus. The 
choice of blocking reagent may be changed depending on the protein 
under study (suggested alternates: 10% FCS in 1× PBS, or 5% FCS in 
PEMBAL). 

Table 32.4 Fixatives commonly used in fission yeast immunofluorescence 

Fixative (a): 70% ethanol 
Composition: 70% ethanol (v/v) in water 
Conditions: Store at -20 °C. Add while vortexing cells to reduce clumping (b) 
Recommended uses: Fast, efficient fixative for DNA quantitation (e.g., nuclei 
with DAPI or flow cytometry). May destroy fluorescent protein fluorescence 
(see methanol). Good fixative for tubulin immunofluorescence 

Fixative: Methanol 
Composition: 100% methanol (v/v) 
Conditions: Store at -20 °C, and add to harvested cells while vortexing. 
Fix for 15 min or longer and store at 4 °C (c) 
Recommended uses: Preservative for cytoplasmic microtubule and actin 
architecture. May destroy GFP fluorescence; use anti-GFP antibody. 
Long-term incubation may destroy nuclear architecture 

Fixative: Methanol/formaldehyde 
Composition: 3.7% formaldehyde (v/v), 10% methanol (v/v), 0.1 M potassium 
phosphate, pH 6.5 (d) 
Conditions: Use 1 culture volume of fixative, rocking for 15-30 min. Store 
fixed cells in PBS at 4 °C long-term (c). Store leftover fixative at room 
temperature, in the dark 
Recommended uses: Good starting fixative for immunofluorescence 
experiments. May destroy GFP fluorescence (use anti-GFP antibody). Good 
for tubulin staining and nuclear morphology 

Fixative: Methanol/acetic acid (MAA) 
Composition: 25% methanol (v/v), 75% glacial acetic acid (v/v) 
Conditions: Resuspend harvested cells in fixative and incubate for 15 min. 
Replacing fixative with fresh MAA after 5 min may enhance fixation (b) 
Recommended uses: Reported to be a good chromosome structure fixative, 
but less successful for cytoplasmic staining 

Fixative: Paraformaldehyde (e) (4% PFA/PBS) 
Composition: 4% paraformaldehyde (w/v) (f) made up in water or 1× PBS (g). 
Best if made and used fresh, although unused aliquots may be stored at 
-20 °C wrapped in foil (h) 
Conditions: Fix for 15 min to start (up to 30 min), rocking at room 
temperature. Remove PFA/PBS, wash once with 1 volume of PBS, then 
incubate in 0.2% Triton X-100 (v/v in water) to permeabilize cells, 15 min, 
room temperature 
Recommended uses: General fixative. 4% PFA/PBS may fix enough of GFP 
structure while retaining GFP fluorescence for flow 
cytometry/immunofluorescence. This requires titration of PFA 
concentration, from 0.1% to 4% 

(a) General word of caution: fixatives are dangerous; use carefully and dispose of waste as dictated by regional guidelines. Methanol, PFA, and formaldehyde are all toxic. PFA and formaldehyde are potential carcinogens. 
(b) The length of exposure of the fixative to the cells may cause epitope destruction. Fixation conditions must be optimized for each experiment/epitope and recorded for reproducibility between experiments. 
(c) Permeabilization is generally not required after alcohol fixation. 
(d) For 500 ml of methanol/formaldehyde buffer mix: 16.9 ml 1 M K2HPO4, 33.1 ml KH2PO4, 50 ml 100% methanol, 50 ml 37% formaldehyde and bring to 500 ml with water. 
(e) Quenching aldehydes is generally not required for fixatives other than glutaraldehyde. 
(f) 4% PFA solution is made by mixing 2 g of PFA powder in 45 ml water + 5 ml of 10× PBS stock. Add 10 µl of 1 N NaOH, and stir in a double boiler (60-80 °C) to bring the powder completely into solution. Do not overheat! Once in solution, take pH of solution; the optimal range is 7.2-8.0. PFA is toxic and caustic: perform all steps in a fume hood and dispose of waste appropriately. 
(g) PFA may be made in 1× PBS, as well as 1× PEM buffer, or HEPES. 
(h) PFA solution stored at -20 °C may precipitate upon thawing. If this is the case, heat to 60 °C in a water bath or heat block, and vortex precipitate into solution. Do not refreeze aliquots. 

6. Split samples into tubes for primary antibody incubation, and 
centrifuge to pellet cells. Prepare primary antibody solution in blocking 
buffer, 200-500 µl for each sample if kept in a microfuge tube. Incubate 
for 2 h to overnight on a rotator. 

7. Wash cells 3 × 1 ml PEMBAL, rotating for 10 min at room temperature 
with each wash. 

8. Add secondary antibody in PEMBAL (200-500 µl volume per tube), 
and incubate 1-2 h in the dark at room temperature. We typically use a 
dilution of 1:250-1:500 for secondary antibodies under these incubation 
conditions, but the optimal concentration should be determined 
by the user and checked for background and nonspecific signal using 
appropriate controls. We do not recommend overnight incubations in 
secondary antibody. 

9. Wash 2 × 1 ml PEMBAL, incubating 10 min on a rotator in the dark 
with each wash. Wash once in 1 ml PEMBAL with 1 µg/ml DAPI, 
freshly prepared. Rotate 10 min in the dark, then harvest cells. 

10. Resuspend the cells in 1 ml PEMBAL to briefly wash, then centrifuge to 
pellet cells. Add a small volume of sterile water, and spot 5-10 µl onto a 
poly-L-lysine treated coverslip. Heat fix briefly, until liquid starts to 
evaporate, and wick away excess with a tissue. Mount the coverslip on 
a slide with mount (50% glycerol, 1 mg/ml PPD). Seal slides with nail 
varnish or VALAP. Store slides at -20 to 4 °C, protected from light. 




6. Conclusion 

S. pombe methods use microbiology techniques common to S. cerevisiae, 
with unique S. pombe modifications. This chapter, along with a friendly 
community and numerous online resources, makes it easy for novices to 
begin analysis of this system and "go fission." 




Appendix: Schizosaccharomyces pombe 
Online Resources 

General fission yeast resources include PombeNet (www.pombe.net) 
and the Sanger Centre web site for genome data. There are now many strain 
resources for complementing and studying fission yeast phenotypes. The 
FYSSION collection, housed at the University of Sussex, UK, curates 
libraries of temperature-sensitive mutants and nonessential deletion mutants 
(Armstrong et al., 2007). This also includes the Bioneer fission yeast deletion 
library, which is available commercially. Another useful resource is the 
GFP-tagged ORFeome, in which all S. pombe ORFs are tagged with GFP to 
track protein localization in live cells in response to conditions; it is available 
from the RIKEN centre (Matsuyama et al., 2006). Resources for analysis such as 
ortholog mapping (YOGI; Wood and Bahler, 2002) and interactions (e.g., 
BioGRID; Breitkreutz et al., 2008) are increasingly powerful, and many are 
housed at or linked from the Sanger Genome Center fission yeast resources. 



Name: URL 

General information, links, resources 
PombeNet: Protocols, resources, people, general information (Forsburg Lab): http://www.pombe.net/ 
Sanger Centre: http://www.sanger.ac.uk/Projects/S_pombe/links.shtml 
NIH S. pombe page: http://www.nih.gov/science/models/Schizosaccharomyces/index.html 

Genome information 
GeneDB (Genome browser and database): http://genedb.org/genedb/pombe/ 
Epigenome home page (Grewal lab): http://pombe.nci.nih.gov/genome/ 
Sequences of related species, Fungal genomes at the Broad Institute: http://broadinstitute.org/science/projects/fungal-genome-initiative/ 

Expression data 
Bahler lab data, UCL (UK): http://www.bahlerlab.info/ 
Gene Expression Viewer: http://www.bahlerlab.info/cgi-bin-SPGE/geexview 
Transcriptome Viewer: http://www.bahlerlab.info/TranscriptomeViewer/ 
Software links (disruption, tagging): http://www.bahlerlab.info/resources 

Strains and plasmids 
ATCC: http://www.atcc.org/Schizosaccharomycespombe/tabid/680/Default.aspx 
Bioneer (commercial gene deletions): http://www.bioneer.com 
National Collection of Yeast Cultures (Norwich, UK): http://www.ncyc.co.uk/ 
RIKEN, GFP-tagged ORFeome: http://yeast.lab.nig.ac.jp/nig/index_en.html 
Yeast Resource Centre, Japan: http://yeast.lab.nig.ac.jp/nig/english/index_en.html 

Email list (list-serv) 
Pombelist (community list-serv): http://lists.sanger.ac.uk/mailman/listinfo/pombelist 
Abstract 

The basidiomycete yeast Cryptococcus neoformans is a prominent human 
pathogen. It primarily infects immunocompromised individuals, producing a 
meningoencephalitis that is lethal if untreated. Recent advances in its genetics and 
molecular biology have made it a model system for understanding both the 
Basidiomycota phylum and mechanisms of fungal pathogenesis. The relative 
ease of experimental manipulation, coupled with the development of murine 
models for human disease, allows for powerful studies of the mechanisms of 
virulence and host responses. This chapter introduces the organism and its life 
cycle and then provides detailed step-by-step protocols for culture, 
manipulation of the genome, analysis of nucleic acids and proteins, and assessment of 
virulence and expression of virulence factors. 




1. Introduction 

Although members of the Ascomycota phylum, particularly Saccharomyces 
cerevisiae, are the most studied fungi, there are 80,000 known species of 
the Fungi kingdom. There is a great deal of diversity in the kingdom, 
ranging from small harmless unicellular yeast such as S. cerevisiae to the 
great plant pathogen Armillaria ostoyae, one of the largest organisms in the 
world. This latter species is a member of the Basidiomycota phylum, a 
phylum less well understood than Ascomycota. 

While no basidiomycete species has been studied in as much detail as 
S. cerevisiae, the phylum is a fascinating and diverse group of organisms. 
Basidiomycetes produce many interesting secondary metabolites used in medicine, 
industry, and research. Members of the phylum account for about 10% (40 
species) of known human fungal pathogens (Morrow and Fraser, 2009). 
With the onset of the AIDS epidemic, one basidiomycete in particular, 
Cryptococcus neoformans, has risen from a little-known pathogen to one of the 
top fungal killers of immunocompromised patients. 

C. neoformans is primarily found as a haploid yeast, and is widely present 
in the environment worldwide, including in avian excreta, soil, and tree 
bark. Studies have shown that humans come into frequent contact with 
C. neoformans: individuals with no history of cryptococcosis possess 
antibodies against the yeast (Chen et al., 1999), and most children appear to 
have been exposed by the age of five (Goldman et al., 2001). This suggests 
that the majority of individuals encounter C. neoformans in the environment, 
most likely through inhalation into the lungs. Immunocompetent 
individuals are usually able to control and contain the infection, often leading to 
an asymptomatic latent state of infection. If the patient's immune system 
becomes compromised at a later date, the latent infection can reactivate. In 
the case of the immunocompromised individual, pulmonary infection can 
lead to pneumonia followed by dissemination via the bloodstream to other 
organs. C. neoformans is one of only a few fungal species known to cross the 
blood-brain barrier and infect the brain (Kim, 2006), leading to meningitis 
that is fatal if left untreated. When the AIDS epidemic began in the 1980s, 
there was a concomitant surge in cryptococcosis cases worldwide. In recent 
years, the increased usage of antiretroviral therapy and antifungals has 
reduced the overall incidence of fatal cryptococcal meningitis. Yet in 
areas where access to treatment is limited, C. neoformans remains an 
important concern in the care of the immunocompromised, including 
AIDS, cancer, and organ transplant patients. In addition, recent outbreaks 
of cryptococcosis in immunocompetent individuals in the Pacific 
Northwest raise concerns about the risk of cryptococcal infection even in 
otherwise healthy individuals (Bartlett et al., 2008; Hoang et al., 2004). 

As a haploid yeast cell, C. neoformans is amenable to many of the 
extensive protocols that have been developed for S. cerevisiae, requiring in 
most cases only a few adjustments. However, because C. neoformans diverged 
from the ascomycete lineage some 400 million years ago (mya) (Taylor and Berbee, 
2006), there are significant differences in its cellular machinery and life cycle 
(see below). Comparative genomics promises to yield rich information 
about the evolution of shared and diverged genes, proteins, and pathways, 
as well as offering insight into the differences between species that allow one 
yeast to exist as a benign saprophyte and another to cause lethal infection in 
a mammalian host. 




2. Serotypes, Strains, and Sequences 

C. neoformans is classified into four different serotypes based on its 
reactivity with monoclonal antibodies to surface capsular polysaccharide 
(Kabasawa et al., 1991). These serotypes have historically been further 
classified into three different varieties: var. neoformans (serotype D), var. 
grubii (serotype A), and var. gattii (serotypes B and C). However, in recent 
years, var. gattii has been proposed to comprise its own species as Cryptococcus 
gattii, based on morphological and biochemical evidence (Kwon-Chung 
and Varma, 2006). C. neoformans var. neoformans and var. grubii primarily 
infect immunocompromised individuals, with var. grubii causing ~99% of 
cryptococcal infections in HIV-infected patients (Mitchell and Perfect, 
1995). C. gattii has the ability to infect immunocompetent individuals, as 
evidenced by an emergent outbreak in the Pacific Northwest that has 
resulted in hundreds of human and veterinary infections. Based on analysis 
of mutation frequency in conserved genes, it is thought that C. neoformans 
and C. gattii diverged about 37 mya, while C. neoformans var. neoformans and 
var. grubii split 18.5 mya, and within C. gattii, serotypes B and C diverged 
9.6 mya (Xu et al., 2000). To date, the genomes of five strains of 
C. neoformans and C. gattii have been sequenced to at least 6× coverage: 
JEC21 (serotype D), B-3501 (serotype D), H99 (serotype A), WM276 
(serotype B), and R265 (serotype B). H99 and R265 are clinical isolates, 
while WM276 was isolated from the environment. JEC21 and B-3501 are 
laboratory-derived strains, where JEC21 was derived from B-3501 through 
a series of crosses and backcrosses (Heitman et al., 1999), and their genomes 
are 99.5% identical (Loftus et al., 2005). 
Online resources for C. neoformans genome sequences 

Strain    URL 

JEC21     http://www.tigr.org/tdb/e2k1/cna1/ 
B-3501    http://www-sequence.stanford.edu/group/Cneoformans/ 
H99       http://www.broad.mit.edu/annotation/fungi/cryptococcus_neoformans 
WM276     http://www.bcgsc.ca/gc/cryptococcus/ 
R265      http://www.broad.mit.edu/annotation/fungi/cryptococcus_neoformans_b/ 

The genomes of C. neoformans and C. gattii contain about 19 Mb of 
DNA spread over 14 chromosomes with about 7000 predicted protein- 
coding genes. The genomic sequence is relatively GC-rich (48% GC 
content) when compared to the genome of S. cerevisiae (38% GC content). 
Nucleic acid enzymatic protocols from S. cerevisiae laboratories that have 
been adapted for use with C. neoformans take this into account with the 
addition of DMSO (5% final concentration) or betaine (1.3 M final con- 
centration) to resolve secondary structures resulting from the higher GC 
content. 
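As a small illustration of the GC-content comparison above, the following sketch computes percent GC for any DNA string. The function name is an assumption made for this example, and the test sequence is one of the 21-nt linker sequences listed in Table 33.2, used here only as convenient input; it is not a genome-wide estimate.

```python
def gc_content(seq: str) -> float:
    """Return percent G+C, counting only unambiguous A, C, G, and T bases."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, or T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# Example input: a linker sequence from Table 33.2 (short, so the value is only illustrative).
linker = "CACGGCGCGCCTAGCAGCGGA"
print(f"{linker}: {gc_content(linker):.1f}% GC")
```

Applied to amplicon or template sequences, a value well above the S. cerevisiae average is one hint that DMSO or betaine, at the concentrations given above, may be worth including.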

Unless otherwise noted, the use of "C. neoformans" in the text of this 
chapter refers to C. neoformans var. neoformans and var. grubii. Although 
many of the same techniques are applicable to C. gattii, their usage is less 
well documented and may require additional adaptations. 




3. Life Cycle 

C. neoformans and C. gattii primarily exist as haploid yeast cells that 
reproduce asexually through budding. They also possess a bipolar mating 
system, with mating types a and α. The mating (MAT) locus regulates the 
sexual cycle and encodes more than 20 genes, including genes for cell 
type identity and the production and sensing of pheromone. Similar to 
S. cerevisiae, MATa cells produce MFa pheromone that is sensed by MATα 
cells. In response to pheromone, MATα cells produce a conjugation tube 
(Fig. 33.1A). Likewise, MATa cells respond to the MFα pheromone 
produced by MATα cells, although the response of MATa cells is to form large 
swollen cells that can then fuse to the conjugation tubes of the MATα cells. 
The MATa and MATα nuclei divide, and the MATα nuclei travel through 
the conjugation tube into the MATa cell. MATa and MATα nuclei move 
into the hypha formed by the MATa cell, and a septum forms between the 
hypha and the MATa cell. The hypha may then elongate through cell 
growth and division. During hyphal elongation, the nuclei divide 
mitotically (Fig. 33.1B). One nucleus divides in such an orientation as to enter 
into a bulge in the cell wall that will later form a clamp connection (i). Septa 
form between the posterior cell wall, the tip of the hypha, and the clamp 
connection, leaving one nucleus in the posterior cell, two nuclei in the tip 
of the hypha, and one nucleus in the clamp (ii). The clamp fuses back to 
merge with the posterior cell (iii), allowing the nucleus present in the clamp 
to join the nucleus in the posterior cell (iv). During sporulation 
(Fig. 33.1A), a basidium forms at the tip of the hypha. In the basidium, 
the MATa and MATα nuclei fuse and undergo meiosis. The new MATa 
and MATα nuclei then undergo repetitive rounds of mitosis, eventually 
forming four chains of spores that emerge from the basidium. The spores are 
then dispersed, and germinate into haploid yeast cells (Bovers et al., 2008; 
McClelland et al., 2004). 

In the laboratory, mating is achieved through nitrogen starvation on V8 
medium, consisting of 5% (v/v) V8 juice, 3 mM KH2PO4, 4% (w/v) agar, 
pH 5.0. Several excellent studies have examined mating conditions; we 
refer you to these for technical details (Escandon et al., 2007; 
Nielsen et al., 2003; Xue et al., 2007). 

Haploid fruiting has also been observed, where cells of one mating type 
become diploid and form hyphae. These monokaryotic hyphae are char- 
acterized by unfused clamp connections. Similar to mating, monokaryotic 
hyphae also form basidia, undergo meiosis, and sporulate (Lin et al., 2005; 
Tscharke et al., 2003; Wickes and Edman, 1995). 




4. Techniques for Basic Culture 

C. neoformans is classified as a Biosafety Level 2 (BSL-2) organism, and 
as such does not require elaborate biohazard safety facilities. Current pre- 
caution recommendations include the use of a Class I or Class II biological 
safety cabinet for manipulation of environmental samples or spore forms. 
Incidences of infection in laboratory personnel are rare, limited in the 
literature to skin puncture accidents with needles heavily contaminated 
with C. neoformans (Casadevall et al., 1994). 

C. neoformans may be cultured using media similar to those used for 
S. cerevisiae cultivation. Common media include YPAD and YNB. For 
some assays (e.g., see Sections 6.1.6 and 6.3.2), Sabouraud dextrose medium 
is used for historic reasons and because it promotes yeast growth 
over bacterial growth. 
Composition (amount per liter) 

YPAD (yeast peptone adenine dextrose) 
1% bacto yeast extract (Becton Dickinson, Cat. No. 212720)        10 g 
2% bacto peptone (Becton Dickinson, Cat. No. 211820)              20 g 
2% glucose                                                        20 g 
0.73 mM L-tryptophan (Sigma, Cat. No. T8941)                      0.15 g 
0.27 mM adenine (Sigma, Cat. No. A2786)                           0.037 g 
Water                                                             to 1 l 

YNB (yeast nitrogen base) 
0.15% YNB w/o amino acids, w/o dextrose, w/o ammonium 
sulfate (BIO 101, Cat. No. 4027-032)                              3 g 
75 mM ammonium sulfate                                            10 g 
2% glucose                                                        20 g 
Water                                                             to 1 l 

Sabouraud dextrose 
3% Sabouraud dextrose broth (Becton Dickinson, Cat. No. 238210)   30 g 
Water                                                             to 1 l 
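The molar concentrations quoted in the media recipes can be cross-checked against the weighed amounts. The sketch below does this conversion; the molar masses are standard values supplied for the example (they are not given in this chapter), and the helper name is hypothetical.

```python
# Standard molar masses in g/mol (supplied for this example, not taken from the chapter).
MOLAR_MASS = {
    "L-tryptophan": 204.23,
    "adenine": 135.13,
    "ammonium sulfate": 132.14,
}

# Amounts per liter as listed in the YPAD and YNB recipes.
GRAMS_PER_LITER = {
    "L-tryptophan": 0.15,
    "adenine": 0.037,
    "ammonium sulfate": 10.0,
}


def millimolar(grams_per_liter: float, molar_mass: float) -> float:
    """Convert a concentration in g/l to mM using the molar mass in g/mol."""
    return 1000.0 * grams_per_liter / molar_mass


for name, grams in GRAMS_PER_LITER.items():
    print(f"{name}: {grams} g/l = {millimolar(grams, MOLAR_MASS[name]):.2f} mM")
```

The same helper can be reused when scaling a recipe to a different batch volume, since the g/l figure is independent of the volume prepared.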

Standard growth is performed in YPAD medium, typically at 30 °C, 
with the alternative use of the defined medium YNB. During logarithmic 
growth in YPAD medium at 30 °C, the doubling time of wild-type 
C. neoformans is approximately 110 min. Consistent with its role as a 
human pathogen, C. neoformans also grows robustly at 37 °C. Unlike 
S. cerevisiae, C. neoformans does not perform fermentation, and therefore 
requires a minimal amount of oxygen for growth. C. neoformans is sensitive 
to alkaline pH, growing poorly at pH 9. However, it is insensitive to acidic 
pH, exhibiting normal doubling times in conditions as low as pH 3. 
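For planning cultures, the doubling time quoted above translates directly into expected fold-increase and time-to-density estimates. This is a minimal sketch assuming ideal exponential growth with no lag phase; the function names and the OD values in the example are illustrative assumptions.

```python
import math

DOUBLING_TIME_MIN = 110.0  # wild-type C. neoformans in YPAD at 30 °C, as stated in the text


def fold_increase(minutes: float, doubling_time: float = DOUBLING_TIME_MIN) -> float:
    """Fold increase in cell number after a given time of exponential growth."""
    return 2.0 ** (minutes / doubling_time)


def minutes_to_reach(start_od: float, target_od: float,
                     doubling_time: float = DOUBLING_TIME_MIN) -> float:
    """Time needed to grow from start_od to target_od, assuming no lag phase."""
    if start_od <= 0 or target_od <= 0:
        raise ValueError("optical densities must be positive")
    return doubling_time * math.log2(target_od / start_od)


# Hypothetical example: dilute to OD 0.1 and harvest at OD 0.8 (three doublings).
print(f"fold increase after 220 min: {fold_increase(220):.1f}")
print(f"time from OD 0.1 to OD 0.8: {minutes_to_reach(0.1, 0.8):.0f} min")
```

Real cultures usually show a lag after dilution, so estimates like these are best treated as a lower bound on the waiting time.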

Frozen stocks of C. neoformans can be maintained in 15% glycerol 
solution at —80 °C. These stocks may be revived following transfer by 
sterile applicator stick to a YPAD plate. 

4.1. Dominant drug selection markers 

Our laboratory and others have used resistance to nourseothricin (NAT), 
G418, and hygromycin for selection in C. neoformans. 

In the plasmids pHL001-STM-# and pJAF1, the genes encoding 
proteins conferring resistance to NAT and G418 have been inserted 
between the promoter element of C. neoformans ACT1 and the terminator 
element of C. neoformans TRP1 (both of these sequences were derived from 
the H99 strain) (Table 33.1). In the plasmid pHYG7-KB1, the gene 
encoding resistance to hygromycin was inserted between the promoter 
element of C. neoformans ACT1 and the untranslated region (UTR) of 
C. neoformans GAL7 (where the ACT1 sequence was derived from H99 
and the GAL7 sequence was derived from JEC21). 

Table 33.1 Dominant drug selection markers 

Drug selection    Plasmid name    Structure                             Reference 

Nourseothricin    pHL001-STM-#    CnACT1 promoter-NAT-CnTRP1 term       Gerik et al. (2005) 
G418              pJAF1           CnACT1 promoter-NEO-CnTRP1 term       Fraser et al. (2003) 
Hygromycin        pHYG7-KB1       CnACT1 promoter-HYG-CnGAL7 UTR        Hua et al. (2000) 

For selection of yeast containing the appropriate drug resistance cassette, 
we use YPAD agar plates made with 0.1 mg/ml nourseothricin (clonNAT, 
Werner BioAgents), 0.2 mg/ml G418 (VWR, Cat. No. 45000-626), and/ 
or 0.3 mg/ml hygromycin (Sigma, Cat. No. H7772). 




5. Basic Molecular Biology Techniques 

5.1. Fusion polymerase chain reaction 

Manipulation of the genomic sequence of a species is a powerful tool for 
analyzing the importance of specific genes in the function of the organism. 
Homologous recombination, or the integration of an exogenous DNA 
construct into the genome, is a crucial step in the site-directed mutagenesis 
of a target gene. C. neoformans performs homologous recombination at 
relatively low frequencies when compared with other fungi (1-4% as 
compared to nearly 100% in S. cerevisiae), but transformation with linear 
constructs flanked by a significant amount (0.3-1 kb) of sequence 
homologous to the genome creates stable integrants reproducibly (Davidson et al., 
2000; Nelson et al., 2003). For example, a linear construct to target a gene 
for deletion might contain an antibiotic resistance cassette flanked on the 5'- 
and 3'-ends with 1 kb sequences homologous to the 5'- and 3'-ends of the 
targeted gene. 

Construction of linear constructs for homologous recombination uses 
a procedure known as fusion PCR or PCR overlap (Davidson et al., 2002). 
In this process, two or more DNA fragments are joined together during the 
polymerase chain reaction (PCR) by virtue of a shared region of homology. 
This region of homology is engineered during previous PCR steps using 
primers containing linker sequences that are then shared between the two 
fragments to be fused together (Fig. 33.2). 

5.1.1. Fusion PCR for targeted gene deletion 

As mentioned previously, a linear construct for targeted gene deletion is 
designed to contain an antibiotic resistance cassette flanked by 1 kb 
sequences homologous to the targeted sequence. To create this construct, 
an antibiotic resistance cassette, such as resistance to NAT, is first amplified 
with primers containing 22 bp of homology to the 5' and 3' ends of the 
cassette, and 21 bp of linker sequence which is different for the 5' and 3' 
primers (these primers are designated primers 3 and 4 in Table 33.2 and 
Fig. 33.2A). Then, from genomic DNA we amplify 1 kb of sequence 
upstream and downstream of the ORF targeted for deletion. We term 
these sequences the 5' and 3' flanks for the targeted gene deletion construct. 
For the 5' flank, the forward primer (1) is 22 bp of exact homology to the 
genomic sequence. The reverse primer (2) is 21 bp of linker sequence that is 
antiparallel to the linker sequence in the forward primer (3) for amplifying 
the antibiotic resistance cassette, followed by 22 bp of homology to the 
genomic sequence. For the 3' flank, the forward primer (5) is 21 bp of linker 
sequence that is antiparallel to the linker sequence in the reverse primer (4) 
for amplifying the antibiotic resistance cassette, followed by 22 bp of 
homology to the genomic sequence. The reverse primer (6) is 22 bp of exact 
homology to the genomic sequence. Table 33.2 contains the linker 
sequences we use to design these primers. 

Primers 1-6 are used first for amplification of the 5'- and 3'-flanks and 
the antibiotic resistance cassette. Then primers 1 and 6 are used in the fusion 
PCR to amplify the full-length linear construct. 
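The primer layout described above can be sketched in code. The linker sequences are those listed in Table 33.2; everything else (the function name, the 0-based coordinates, and the randomly generated toy genome and cassette) is an assumption made for illustration, and real primers should still be checked for melting temperature and specificity.

```python
import random

# Linker sequences from Table 33.2, written 5' to 3'.
LINKERS = {
    2: "CACGGCGCGCCTAGCAGCGGA",
    3: "CCGCTGCTAGGCGCGCCGTGA",
    4: "GCAGGGATGCGGCCGCTGACA",
    5: "GTCAGCGGCCGCATCCCTGCA",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of an uppercase or lowercase DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_deletion_primers(genome, orf_start, orf_end, cassette,
                            flank=1000, anneal=22):
    """Return primers 1-6 for a deletion construct.

    genome and cassette are top-strand sequences; orf_start and orf_end are
    0-based, end-exclusive coordinates of the ORF within genome.
    """
    if orf_start - flank < 0 or orf_end + flank > len(genome):
        raise ValueError("not enough flanking sequence around the ORF")
    if len(cassette) < anneal:
        raise ValueError("cassette is shorter than the annealing region")
    up_start = orf_start - flank
    down_end = orf_end + flank
    return {
        1: genome[up_start:up_start + anneal],                       # forward, 5' flank
        2: LINKERS[2] + revcomp(genome[orf_start - anneal:orf_start]),  # reverse, 5' flank
        3: LINKERS[3] + cassette[:anneal],                           # forward, cassette
        4: LINKERS[4] + revcomp(cassette[-anneal:]),                 # reverse, cassette
        5: LINKERS[5] + genome[orf_end:orf_end + anneal],            # forward, 3' flank
        6: revcomp(genome[down_end - anneal:down_end]),              # reverse, 3' flank
    }


# Toy sequences generated at random purely to exercise the function.
rng = random.Random(0)
toy_genome = "".join(rng.choice("ACGT") for _ in range(4000))
toy_cassette = "".join(rng.choice("ACGT") for _ in range(1500))
primers = design_deletion_primers(toy_genome, 1500, 2500, toy_cassette)
for number, seq in primers.items():
    print(number, len(seq), seq)
```

Primers 1 and 6 come out at 22 nt and primers 2-5 at 43 nt (21 nt of linker plus 22 nt of homology), matching the lengths given in the text.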

5.1.1.1. Conditions for amplification of 5' and 3' flanks and antibiotic 
resistance cassette (50 µl final volume): 400 nM each primer, 0.25 mM 
dNTPs, 20 mM Tris-HCl (pH 8.8), 2 mM MgSO4, 10 mM KCl, 10 mM 
(NH4)2SO4, 0.1% (v/v) Triton X-100, 0.01% (w/v) BSA (98% 
electrophoresis grade, Sigma, Cat. No. A7906), 5% (v/v) DMSO, 2.5 U Pfu 
polymerase, 0.5 µl of template DNA (genomic DNA at 1 µg/µl or plasmid bearing 
antibiotic resistance cassette at 30 ng/µl). 

We maintain at 4 °C a 10× stock of PCR buffer that contains 200 mM 
Tris-HCl (pH 8.8), 20 mM MgSO4, 100 mM KCl, 100 mM (NH4)2SO4, 
1% (v/v) Triton X-100, 0.1% (w/v) BSA. We then add the primers, 
DMSO, dNTPs, Pfu, and template DNA separately. 
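Setting up the 50 µl reaction from stocks is a C1V1 = C2V2 calculation. The sketch below works through it for the amplification conditions above; the 10× buffer and the final concentrations come from the text, while the primer (10 µM), dNTP (10 mM), and Pfu (2.5 U/µl) stock concentrations are assumptions chosen only to make the example concrete.

```python
FINAL_VOLUME_UL = 50.0


def stock_volume(final_conc: float, stock_conc: float,
                 final_volume_ul: float = FINAL_VOLUME_UL) -> float:
    """Volume of stock (µl) giving final_conc in final_volume_ul; units must match."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated as the final")
    return final_volume_ul * final_conc / stock_conc


volumes = {
    "10x PCR buffer": stock_volume(1, 10),                        # 10x stock to 1x
    "DMSO": stock_volume(5, 100),                                 # 5% (v/v) from neat DMSO
    "dNTPs (assumed 10 mM stock)": stock_volume(0.25, 10),        # mM
    "forward primer (assumed 10 µM stock)": stock_volume(0.4, 10),  # 400 nM = 0.4 µM
    "reverse primer (assumed 10 µM stock)": stock_volume(0.4, 10),
    "Pfu (assumed 2.5 U/µl stock)": stock_volume(2.5, 2.5, 1.0),  # 2.5 U total
    "template DNA": 0.5,                                          # fixed volume from the text
}
water_ul = FINAL_VOLUME_UL - sum(volumes.values())

for name, vol in volumes.items():
    print(f"{name:<40} {vol:5.2f} µl")
print(f"{'water':<40} {water_ul:5.2f} µl")
```

With different stock concentrations only the assumed values change; the buffer and DMSO volumes follow from the text regardless.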

PCR conditions, performed on a PTC-200 Peltier Thermal Cycler (MJ 
Research): 93 °C for 3 min, followed by 35 cycles of (93 °C for 30 s, 45 °C 
for 30 s, 72 °C for 3.5 min or appropriate amount of time for the length of 
your antibiotic resistance cassette), followed by 72 °C for 5 min. 

Figure 33.2 (A) Targeted gene deletion: The construct contains an antibiotic resistance 
cassette flanked on the 5'- and 3'-ends with 1 kb regions of homology upstream and 
downstream of the targeted gene. (B) Epitope-tagging: The construct contains the 
epitope tag and an antibiotic resistance cassette flanked on the 5'-end with a 1-kb 
region of homology to the 3'-end of the targeted gene, and on the 3'-end with a 
1-kb region of homology to the region immediately downstream of the targeted 
gene. (C) Promoter replacement: The construct contains an antibiotic resistance cassette 
and the desired promoter region flanked on the 5'-end with a 1-kb region homologous 
to the sequence upstream of the promoter to be replaced, and on the 3'-end with a 1-kb 
region homologous to the 5'-end of the targeted gene. NAT, nourseothricin 
resistance cassette; YFG, targeted gene. 

Table 33.2 Primers for construction of targeted gene deletion construct by 
fusion PCR 

Primer  Sequence 

1       Forward primer to 5' flank: 22 bp of sequence 1 kb upstream of ORF 
2       Reverse primer to 5' flank: 
        CACGGCGCGCCTAGCAGCGGA-22 bp of sequence immediately upstream of ORF 
3       Forward primer to antibiotic resistance cassette: 
        CCGCTGCTAGGCGCGCCGTGA-22 bp of sequence at 5' end of antibiotic resistance cassette 
4       Reverse primer to antibiotic resistance cassette: 
        GCAGGGATGCGGCCGCTGACA-22 bp of sequence at 3' end of antibiotic resistance cassette 
5       Forward primer to 3' flank: 
        GTCAGCGGCCGCATCCCTGCA-22 bp of sequence immediately downstream of ORF 
6       Reverse primer to 3' flank: 22 bp of sequence 1 kb downstream of ORF 

The linker sequences are not exactly antiparallel with each other; you will note that all linker 
sequences in primers 2-5 end in an adenine prior to the 22 bp of homologous sequence. 
Taq polymerase exhibits terminal transferase activity, which adds an additional adenosine onto the 
3' ends of PCR products. Therefore, the extra adenine in the primers allows for perfect 
homology between the linker sequences during the actual fusion PCR that fuses the three 
fragments together into the targeted gene deletion construct. The sequences for these primers 
are listed 5'-3'. 
The linker sequences were adapted from previous work (Reid et al., 2002). 
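The relationship between the paired linkers in Table 33.2 (offset by one base, each ending in an adenine) can be checked directly. The sketch below, with hypothetical function names, tests whether the strand copied from one linker, once extended by a single untemplated A, ends in exactly the partner linker.

```python
# Linker sequences from Table 33.2, written 5' to 3'.
LINKERS = {
    2: "CACGGCGCGCCTAGCAGCGGA",
    3: "CCGCTGCTAGGCGCGCCGTGA",
    4: "GCAGGGATGCGGCCGCTGACA",
    5: "GTCAGCGGCCGCATCCCTGCA",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def a_tailed_end_matches(linker: str, partner: str) -> bool:
    """True if the copy of `linker` plus one added 3' A ends with `partner`."""
    return (revcomp(linker) + "A").endswith(partner)


def linkers_are_compatible(first: str, second: str) -> bool:
    """Check the A-tailed match in both directions for a linker pair."""
    return a_tailed_end_matches(first, second) and a_tailed_end_matches(second, first)


junctions = {"5' flank / cassette": (2, 3), "cassette / 3' flank": (4, 5)}
results = {name: linkers_are_compatible(LINKERS[a], LINKERS[b])
           for name, (a, b) in junctions.items()}
for name, ok in results.items():
    print(f"{name}: {'compatible' if ok else 'mismatch'}")
```

The same check is a quick sanity test when substituting different linker sequences into the primer design.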

Purify the PCR products by running the PCR out on a 0.8% agarose 
gel. Cut out the appropriate size band in the gel, and purify using a 
QIAquick Gel Extraction kit (Qiagen, Cat. No. 28704), following the 
manufacturer's instructions. In the final step, elute from the column with 
30 µl of elution buffer (EB), letting the column stand for 1 min, then
centrifuging for 1 min at 13,000 rpm. 
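
The linker logic described in the notes to Table 33.2 can be checked with a few lines of code. The sketch below takes the linker portions of primers 2-5 as listed in the table, copies each one as its reverse complement, appends the single adenosine that Taq adds to the 3' end, and tests whether the result ends in the partner linker. The function names are illustrative only and are not part of the original protocol.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def taq_product_end(linker: str) -> str:
    """3' end of the strand copied across a linker, plus the A added by Taq."""
    return reverse_complement(linker) + "A"


# Linker portions of primers 2-5 as listed in Table 33.2 (5' to 3').
LINKERS = {
    2: "CACGGCGCGCCTAGCAGCGGA",
    3: "CCGCTGCTAGGCGCGCCGTGA",
    4: "GCAGGGATGCGGCCGCTGACA",
    5: "GTCAGCGGCCGCATCCCTGCA",
}


def linkers_pair(a: int, b: int) -> bool:
    """True if the A-tailed product end from each linker carries the other linker."""
    return (taq_product_end(LINKERS[a]).endswith(LINKERS[b])
            and taq_product_end(LINKERS[b]).endswith(LINKERS[a]))


if __name__ == "__main__":
    # Primers 2/3 join the 5' flank to the cassette; 4/5 join the cassette to the 3' flank.
    print(linkers_pair(2, 3), linkers_pair(4, 5))
```

Both junctions pass this check, whereas a mismatched pair such as primers 2 and 5 does not, which is the property that lets the three fragments anneal only in the intended order.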



5.1.1.2. Conditions for fusion PCR (50 µl final volume): 400 nM each
primer (primers 1 and 6), 0.25 mM dNTPs, 10 mM Tris-HCl (pH 8.3),
50 mM KCl, 2 mM MgCl2, 1.3 M betaine, 1 U Taq polymerase, 0.25 U Pfu
polymerase, 50 nmol each of 5' flank, 3' flank, and antibiotic resistance
cassette (roughly equal to 2 µl of the eluted volume from the QIAquick Gel
Extraction).

We maintain at 4 °C a 10× stock of PCR buffer that contains 100 mM
Tris-HCl (pH 8.3), 500 mM KCl, 20 mM MgCl2. We then add the betaine,
dNTPs, Pfu, and Taq polymerases separately. Betaine is maintained as a 5 M
stock at 4 °C.
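
As a worked example of the dilutions implied above, the following sketch applies C1 x V1 = C2 x V2 to the stocks named in the text (10× buffer, 5 M betaine) for a 50 µl reaction. The 10 mM dNTP stock concentration is an assumption made only for illustration; the text does not state it.

```python
def stock_volume_ul(stock_conc: float, final_conc: float, final_volume_ul: float) -> float:
    """Volume of stock needed so that C_stock * V_stock = C_final * V_final."""
    if stock_conc <= 0:
        raise ValueError("stock concentration must be positive")
    return final_conc * final_volume_ul / stock_conc


FINAL_VOLUME_UL = 50.0

# 10x PCR buffer diluted to 1x.
buffer_ul = stock_volume_ul(10.0, 1.0, FINAL_VOLUME_UL)
# 5 M betaine stock diluted to 1.3 M.
betaine_ul = stock_volume_ul(5.0, 1.3, FINAL_VOLUME_UL)
# Hypothetical 10 mM dNTP stock (not stated in the text) diluted to 0.25 mM.
dntp_ul = stock_volume_ul(10.0, 0.25, FINAL_VOLUME_UL)

if __name__ == "__main__":
    print(f"10x buffer: {buffer_ul:.2f} µl")
    print(f"5 M betaine: {betaine_ul:.2f} µl")
    print(f"dNTPs (assumed 10 mM stock): {dntp_ul:.2f} µl")
```

Betaine alone accounts for about a quarter of the reaction volume, which is worth keeping in mind when deciding how much volume remains for template and primers.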

PCR conditions, performed on a PTC-200 Peltier Thermal Cycler (MJ 
Research): 72 °C for 10 min, 92.5 °C for 3.5 min, followed by 35 cycles of 
(92.5 °C for 12 s, 52 °C for 12 s, 72 °C for 7 min or appropriate amount of 
time for the length of the full targeted gene deletion construct), followed by 
72 °C for 5 min. 

Purify the PCR product by running the PCR out on a 0.8% agarose gel. 
Cut out the appropriate size band in the gel, and purify the DNA in the gel 
slice using a QIAquick Gel Extraction kit, following the manufacturer's 
instructions. In the final step, elute from the column with 50 µl of EB,
letting the column stand for 1 min, then centrifuging the column for 1 min
at 13,000 rpm. Add another 30 µl of EB to the column, let stand for 1 min,
then centrifuge for 1 min at 13,000 rpm.

With modifications, fusion PCR may be utilized for a variety of genetic 
manipulations. By selecting different sequences for amplification, we have 
successfully used fusion PCR to introduce epitope tags into the 5'- and 3'-
ends of genes (Fig. 33.2B), and to replace the promoters of genes (e.g., for
overexpression of genes or placing genes under the control of inducible
promoters) (Fig. 33.2C).

5.2. Transformation 

The preferred method for transformation into C. neoformans is biolistic 
delivery. While studies have shown that C. neoformans is transformable by 
electroporation, these transformations have been low efficiency, resulting in 
some stable ectopic transformants but also many unstable transformants 
harboring extrachromosomal DNA material (Edman, 1992; Edman and 
Kwon-Chung, 1990). In addition, electroporation of different C. neoformans 
strains has varying degrees of success; strain H99 (serotype A) is much less 
tractable to transformation in this way than strain B-3501 (serotype D). In 
contrast, both serotypes A and D are readily transformed with relatively high 
efficiency by biolistic delivery (Davidson et al., 2000; Toffaletti et al., 1993).
In biolistic delivery, DNA is introduced into the yeast cell using a 
biolistic particle delivery system (PDS-1000/He, Bio-Rad) hooked up to 
a vacuum pump (Maxima C Plus M6C, Fisher Scientific) and compressed 
helium tank (Fig. 33.3). The targeted gene deletion constructs generated by 
fusion PCR (see above) are deposited onto gold bead microcarriers that are 
then positioned on a macrocarrier disk in the main chamber of the biolistic 
PDS. The air in the biolistic PDS is removed by a vacuum pump, and 
helium is pumped into a small chamber (termed the gas acceleration tube) 
positioned above the macrocarrier disk, separated from the main chamber 
Figure 33.3 (A) Example PDS-1000/He biolistic particle delivery system (Bio-Rad):
(a) "VENT/HOLD/VAC" toggle switch, (b) "FIRE" toggle switch, (c) gas
acceleration tube/retaining cap, (d) microcarrier launch assembly, and (e) plate holder.
(B) Close-up of microcarrier launch assembly for biolistic particle delivery system:
(d) microcarrier launch assembly, (f) top of microcarrier launch assembly,
(g) macrocarrier holder. (C) Example of a YPAD plate immediately following biolistic
transformation. Note the scattering of microcarriers in the center of the patch of
C. neoformans cells.



by a pressure-calibrated rupture disk. At a high enough pressure, the rupture 
disk breaks and the helium blasts into the main chamber of the biolistic 
PDS, propelling the macrocarrier disk downward against a metal stopping 
screen. The force of the impact against the stopping screen propels the 
DNA-coated microcarriers off the macrocarrier disk at high velocity and 
into C. neoformans cells that have been plated onto an agar plate and 
positioned below. 



5.2.1. Protocol for biolistic transformation 

Preparation of constructs (may be done anytime prior to the day of 
transformation) 

1. Transfer 30 µl of the purified construct from fusion PCR (see above)
into a microcentrifuge tube or into one well of a 96-well skirted PCR
plate (Fisher, Cat. No. 055068).

2. Concentrate the DNA to the bottom of the tube by removing all
moisture by SpeedVac.

3. Add 2.5 µl water, pipetting up and down and around the walls of the
tube/well multiple times to resuspend the DNA.

4. Add 12.5 µl of microcarriers (0.6 µm gold beads, Bio-Rad, Cat. No.
1652262, resuspended in water to 60 mg/ml and maintained at 4 °C).

5. Add 12.5 µl of 2.5 M CaCl2 (maintained at 4 °C). Pipette up and down
to mix.

6. Add 5 µl of 0.1 M spermidine (1 M stocks are maintained at -80 °C
and diluted to 0.1 M in water prior to use).

7. Mix on a vortexer at low speed for 4 min. We use a Vortex Genie 2
(Fisher, Cat. No. 12-812) with a platform attachment, set to Vortex 
level 1. If using a skirted PCR plate, cover with a plastic plate seal 
(Qiagen, Cat. No. 1018104). 

8. Collect the microcarriers by centrifugation at 500 rpm for 10 s. 

9. Remove the supernatant by pipette. 

10. Wash microcarriers by adding 50 µl 70% ethanol and immediately
removing by pipette, being careful not to disturb the microcarrier pellet.

11. Add 50 µl 100% ethanol and immediately remove by pipette.

12. Resuspend the DNA-coated microcarriers in 12.5 µl 100% ethanol by
pipetting up and down.

13. Transfer all 12.5 µl of microcarriers onto the center of a macrocarrier
disk (Bio-Rad, Cat. No. 1652335) deposited in a 6-well culture dish
(Falcon, Cat. No. 35-3224).

14. Dry the disk until all ethanol has evaporated, leaving a dark gold residue 
on the surface of the macrocarrier. We dry the disk by placing the 
6-well culture dish in a desiccator hooked up to the house vacuum. 
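
For readers scaling the construct preparation up or down, the quantities implied by steps 3-6 can be worked out directly from the listed volumes and stock concentrations. The sketch below computes the gold mass per macrocarrier and the CaCl2 and spermidine concentrations in the coating mixture; it assumes the four additions simply sum to the total volume and that the dried DNA contributes no volume.

```python
# Volumes (µl) and stock concentrations taken from steps 3-6 above.
WATER_UL = 2.5
BEADS_UL, BEADS_MG_PER_ML = 12.5, 60.0
CACL2_UL, CACL2_M = 12.5, 2.5
SPERMIDINE_UL, SPERMIDINE_M = 5.0, 0.1


def total_volume_ul() -> float:
    """Total coating-mixture volume, assuming the additions are additive."""
    return WATER_UL + BEADS_UL + CACL2_UL + SPERMIDINE_UL


def gold_mass_mg() -> float:
    """Mass of gold microcarriers loaded per macrocarrier."""
    return BEADS_UL / 1000.0 * BEADS_MG_PER_ML


def final_molarity(stock_molar: float, volume_ul: float) -> float:
    """Concentration of a component after dilution into the full mixture."""
    return stock_molar * volume_ul / total_volume_ul()


if __name__ == "__main__":
    print(f"Total volume: {total_volume_ul():.1f} µl")
    print(f"Gold per shot: {gold_mass_mg():.2f} mg")
    print(f"CaCl2: {final_molarity(CACL2_M, CACL2_UL):.2f} M")
    print(f"Spermidine: {final_molarity(SPERMIDINE_M, SPERMIDINE_UL) * 1000:.1f} mM")
```

Under these assumptions each macrocarrier carries 0.75 mg of gold, coated in roughly 1 M CaCl2 and about 15 mM spermidine.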

Day one 

1. Inoculate C. neoformans from plate stock into 50 ml liquid YPAD 
medium in a 250 ml flask. Grow with aeration at 30 °C for 2–3 days.

Day three 

1. Collect C. neoformans culture by centrifugation at 3000 rpm for 10 min. 

2. Resuspend cell pellet in 5 ml regeneration medium (see recipe below). 

3. Pipette 140 µl of resuspended cells onto the center of a YPAD plate, one
plate for each transformation to be performed. Use a spreader device
(Marsh Brand, Cat. No. KG-5P) to spread the cells into a circular patch,
4–5 cm in diameter.

4. Let the plates dry with lids ajar at 30 °C for 20–30 min.

5. Perform transformation with biolistic PDS. 

a. Open valve of compressed helium tank. 

b. Turn on vacuum pump and biolistic delivery system. 

c. Unscrew retaining cap at the end of the gas acceleration tube. 

d. Dip rupture disk briefly (2–3 s) in 70% isopropanol, to sterilize the
disk and aid in its retention in the retaining cap following rupture. 

e. Place the rupture disk in the retaining cap of the biolistic delivery 
system, screw retaining cap back into place. 

f. Press microcarrier-coated macrocarrier (microcarrier-side up) into
the macrocarrier holder.

g. Place stopping screen at bottom of the microcarrier launch assembly
of the biolistic delivery system.

h. Place the macrocarrier holder into the microcarrier launch assembly,
above the stopping screen (microcarrier-side down). Screw top on
microcarrier launch assembly.

i. Load microcarrier launch assembly in the first slot from the top in
the main chamber of the biolistic delivery system.

j. Load plate holder in the third slot from the top in the main
chamber of the biolistic delivery system.

k. Place a YPAD plate with patch of C. neoformans cells from step 3 on
the plate holder. Remove its lid.

l. Close the door of the biolistic delivery system, flipping the switch of
the biolistic delivery system to "VAC" to start drawing air out of the
chamber.

m. When the pressure gauge reads more than 27 in. Hg vacuum, flip
the switch of the biolistic delivery system to "HOLD."

n. Hold down the "FIRE" button until you hear the rupture disk
break (it will sound like a loud pop) and the helium pressure
drops down to zero.

o. Release the "FIRE" button and flip the switch of the biolistic
delivery system to "VENT" to release the vacuum from the
chamber.

6. Incubate transformed plates at 30 °C for 4 h to allow for recovery. 

7. Resuspend cells in 800 µl phosphate-buffered saline (PBS) using a spreader
device, then transfer by pipette to a plate containing selective medium.
Spread the cells over the surface of the plate using a spreader device.

8. Dry the plates at 30 °C with the lids ajar for 30 min or until dry.

9. Cover plates and incubate 2–3 days. Colonies should be visible by the
end of the next day and of a pickable size (~0.5 mm diameter) by the
second day. These colonies should be picked and patched out onto
selective medium plates for confirmation of their genotype by PCR.
We typically patch out the colonies for a verification of the 5'-junction.
The colonies that show successful integration of the construct at the
5'-junction are streaked out to single colonies on selective medium, and
new colonies are patched out for verification of the 3'-junction.

Notes 

1. For concentrating the transformation construct, if using microcentrifuge
tubes we run them for 30–45 min in a Savant SC100 SpeedVac on high
drying rate. If using a skirted PCR plate, we use a Savant AES2010
SpeedVac outfitted with plate holders, set to run for 6 h with 45 min of
radiant cover heating on high drying rate.

2. Sterilize the macrocarriers and stopping screens prior to use by washing 
in 70% ethanol and drying in a sterile environment. We find a 15-cm 
Petri dish to work well for this purpose. 
3. We use rupture disks rated between 1100 and 1350 psi (Bio-Rad, Cat.
Nos. 1652329 and 1652330). Both have given good transformation results.

4. Following successful transformation, it is usually possible to see a
spattering of gold beads embedded into the YPAD plate (Fig. 33.3C). If this
is not visible, it is likely that not enough gold beads were used in the
macrocarrier setup.

5. We find it best to pick the colonies by the end of the second day, because 
one obtains a higher rate at that time of successful transformants that test 
positive for the integration of the drug selection cassette at the targeted 
locus. Waiting until the third day or later allows for false positive 
colonies to catch up in size with true positives. In general, we find it 
best to pick the largest colonies on the plate, although for disruptions in 
genes that positively regulate cell growth, these knockouts can be slower 
growing than some false positives. We typically pick and patch out 6–8
colonies per transformation, although more may be picked for
transformations with a lower success rate.

6. We have observed varying transformation success rates among strains 
that are theoretically genetically identical (i.e., H99 strains from different 
laboratory sources). We hypothesize that in the process of passaging 
these strains, mutations have been acquired that affect homologous 
recombination efficiency. 



Regeneration medium                                          Composition

0.9% YNB w/o ammonium sulfate w/o dextrose w/o
  ammonium sulfate (BIO 101, Cat. No. 4027-032)              9 g
1 M sorbitol                                                 182 g
1 M mannitol                                                 182 g
2.6% glucose                                                 26 g
0.267% bacto yeast extract (Becton Dickinson, Cat.
  No. 212720)                                                0.27 g
0.054% bacto peptone (Becton Dickinson, Cat. No.
  211820)                                                    0.54 g
0.133% gelatin (Sigma, Cat. No. G-8150)                      1.33 g
Water                                                        to 1 l
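
When preparing a different volume of regeneration medium, it can help to recompute the masses from the stated percentages rather than scaling the gram values by hand. The sketch below converts % (w/v) and molarity to grams per liter and compares the result with the gram values printed in the recipe. The molecular weight of 182.17 g/mol applies to both sorbitol and mannitol (C6H14O6). As printed, the yeast extract row pairs 0.267% with 0.27 g, whereas 0.267% (w/v) corresponds to 2.67 g per liter, so that row is worth checking against the original source before use.

```python
def percent_wv_to_grams(percent: float, volume_l: float = 1.0) -> float:
    """Grams of solute for a % (w/v) solution: 1% (w/v) is 10 g per liter."""
    return percent * 10.0 * volume_l


def molar_to_grams(molarity: float, mol_weight: float, volume_l: float = 1.0) -> float:
    """Grams of solute for a molar solution."""
    return molarity * mol_weight * volume_l


# (component, % w/v as printed, grams per liter as printed)
PRINTED_ROWS = [
    ("YNB", 0.9, 9.0),
    ("glucose", 2.6, 26.0),
    ("bacto yeast extract", 0.267, 0.27),
    ("bacto peptone", 0.054, 0.54),
    ("gelatin", 0.133, 1.33),
]


def mismatches(rows, tolerance_g: float = 0.01):
    """Names of rows whose printed grams disagree with the printed percentage."""
    return [name for name, pct, grams in rows
            if abs(percent_wv_to_grams(pct) - grams) > tolerance_g]


if __name__ == "__main__":
    print("Sorbitol or mannitol, 1 M:", round(molar_to_grams(1.0, 182.17)), "g per liter")
    print("Rows to double-check:", mismatches(PRINTED_ROWS))
```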



5.3. Colony PCR 

The genotype of the transformed strain is verified through PCR-based 
detection of the expected 5'- and 3'-junctions of the resistance marker
with the genomic DNA. While verifying the genotypes of many transfor- 
mations, it is easiest and fastest to perform colony PCR. 
1. Patch out colonies from the transformation into 48- or 96-well grid
format on plates containing selective medium.

2. Grow at 30 °C for 2 days.

3. Using a 48- or 96-well pin replicator, transfer a generous amount of cells
into 7 µl of water in each well of a PCR plate.

4. Seal the PCR plate with PCR plate thermal adhesive sealing film. 

5. Flash freeze the PCR plate in liquid nitrogen, then immediately transfer 
to a PCR block set at 100 °C. Incubate for 2–5 min.

6. Perform PCR as below. 

Reaction mix (50 µl final volume): 400 nM each primer, 0.25 mM dNTPs,
20 mM Tris-HCl (pH 8.8), 2 mM MgSO4, 10 mM KCl, 10 mM (NH4)2SO4, 0.1% (v/v)
Triton X-100, 0.01% (w/v) BSA (98% electrophoresis grade, Sigma, Cat.
No. A7906), 5% (v/v) DMSO, 0.15 U Pfu polymerase, 0.5 U Taq
polymerase.

We maintain at 4 °C a 10× stock of PCR buffer that contains 200 mM
Tris-HCl (pH 8.8), 20 mM MgSO4, 100 mM KCl, 100 mM (NH4)2SO4,
1% (v/v) Triton X-100, 0.1% (w/v) BSA. We then add the primers,
DMSO, dNTPs, Pfu, and Taq polymerases separately.

PCR conditions, performed on a PTC-200 Peltier Thermal Cycler (MJ
Research): 92.5 °C for 3 min, followed by 35 cycles of (92.5 °C for 15 s,
45 °C for 15 s, 72 °C for 1 min 45 s or appropriate amount of time for the
length of targeted amplicon), followed by 72 °C for 5 min.

Notes 

1. For adequate DNA recovery, there should be a visible amount of cells in
the 7 µl of water in the PCR plate prior to flash freezing.

2. We use a fixed solid pin replicator (V&P Scientific, Cat. No. VP 408H) 
both to mark the selection medium plates on which colonies are patched 
and for the transfer of cells into a PCR plate. We use thin well PCR 
plates from RPI (Research Products International, Cat. No. 141314) 
and TempPlate Sealing Film (USA Scientific, Cat. No. 2921-000) for 
the PCR. 

3. Colony PCR may also be performed in single tube reactions, using a
sterile toothpick in this case to transfer an appropriate number of cells
into 7 µl water in a PCR tube.

4. To verify successful gene deletion, we use primers designed to amplify
DNA sequences of approximately 1 kb. The verification primers target
the sequences outside of the region amplified as 5'- and 3'-flanks for the
targeted gene deletion construct, and are paired with common primers
internal to the gene encoding the drug resistance.

5. As an additional test for successful gene replacement, it is often useful to 
perform a PCR to the ORF of the targeted gene to confirm its absence 
in the transformed strain. 
5.4. Genomic DNA extraction

1. Inoculate 50 ml of YPAD with C. neoformans. Culture at 30 °C until
saturation (1–2 days).

2. Harvest cells by centrifugation in a 50-ml Falcon tube at 3000 rpm for 
10 min. 

3. Remove supernatant, add 30 ml water. 

4. Vortex to mix, then harvest cells by centrifugation at 3000 rpm for 
10 min. 

5. Remove supernatant and flash freeze cell pellet in liquid nitrogen. 

6. Transfer the conical tube containing the cell pellet into a lyophilizer 
vessel. 

7. Attach to a lyophilizer (FreeZone 4.5 Liter Benchtop Freeze Dry 
System, Labconco) connected to a vacuum pump (Maxima C Plus 
M6C, Fisher Scientific). 

8. Lyophilize cell pellet overnight, or until all the liquid has sublimated 
and a dry powdery pellet is left. 

9. Add 3–5 ml of 3 mm glass beads and vortex vigorously until a fine
powder is created. 

10. Add 10 ml CTAB extraction buffer (100 mM Tris (pH 7.5), 0.7 M 
NaCl, 10 mM EDTA, 1% (w/v) CTAB (hexadecyltrimethylammo- 
nium bromide, Sigma, Cat. No. H6269), 1% (v/v) beta-mercaptoetha- 
nol) and mix. 

11. Incubate at least 30 min at 65 °C. 

12. Add an equal volume of chloroform and mix gently. 

13. Pellet cell debris to the interphase by centrifugation at 3000 rpm for 
10 min. 

14. Transfer aqueous phase to a fresh tube. 

15. Add an equal volume of isopropanol and mix gently. 

16. Pellet DNA by centrifugation at 3000 rpm for 10 min. 

17. Wash DNA pellet with 70% ethanol. 

18. Aspirate out supernatant. Invert tube and allow pellet to dry overnight. 

19. Resuspend DNA in 500 µl TE (100 mM Tris (pH 7.5), 1 mM EDTA)
and transfer to a 1.5-ml microcentrifuge tube.

20. Add 1 µl RNase (1 mg/ml stock solution), and incubate at least 30 min
at 37 °C.

21. Add 5 µl proteinase K (20 mg/ml stock solution), and incubate 2 h at
55 °C.

22. Add 500 µl equilibrated phenol (Sigma, Cat. No. P4557). Separate
phases by centrifugation at 14,000 rpm for 10 min.

23. Transfer aqueous phase to a new 1.5-ml microcentrifuge tube. Add
500 µl chloroform. Vortex briefly to mix. Separate phases by
centrifugation at 14,000 rpm for 10 min.

24. Transfer aqueous phase to a new 1.5-ml microcentrifuge tube. Add 1/10
volume 3 M NaOAc and 2–3 volumes 100% EtOH. Briefly vortex or
flick to mix. Incubate at -20 °C for 2 h or -80 °C for 0.5–1 h.

25. Centrifuge at 14,000 rpm for 10 min. Aspirate out the supernatant. 

26. Dry the pellet in a SpeedVac concentrator for 2 min. 

27. Resuspend DNA in 500 µl TE.

Note 

Lyophilization greatly enhances recovery of nucleic acid from C. neoformans.
Desiccation may weaken the structure of the polysaccharide capsule
and cell wall, allowing greater disruption in later steps.



5.5. RNA extraction 

1. Culture C. neoformans cells in the conditions desired for harvesting 
RNA. 

2. Harvest cultures by centrifugation. 

3. Remove medium and flash freeze cell pellet in liquid nitrogen. 

4. Lyophilize cell pellet until dry. 

5. Resuspend cell pellet in 1 ml TRIzol Reagent (Invitrogen, Cat. No.
15596018) and transfer to a 2-ml screw-cap microcentrifuge tube
(Sarstedt, Cat. No. 72.693.005) containing ~200 µl volume of
0.5 mm zirconia/silica beads (BioSpec Products, Cat. No. 11079105z).

6. Bead-beat at least twice for 2.5-min intervals in a Mini-BeadBeater-8
(BioSpec Products).

7. Centrifuge samples at 12,000 × g for 10 min at 4 °C.

8. Transfer cleared lysate to a 1.5-ml microcentrifuge tube and add 200 µl
chloroform.

9. Vortex for 15 s to mix.

10. Centrifuge samples at 12,000 × g for 10 min at 4 °C.

11. Transfer the aqueous phase to a new 1.5-ml microcentrifuge tube and
add 500 µl isopropanol.

12. Briefly vortex and allow to sit at 4 °C for at least 15 min.

13. Centrifuge samples at 12,000 × g for 10 min at 4 °C.

14. Remove supernatant and wash pellets with 1 ml 75% ethanol (prepared
with RNase-free water).

15. Vortex to mix and centrifuge sample at 10,000 × g for 5 min at 4 °C.

16. Remove supernatant and dry pellet by spinning in a SpeedVac
concentrator (Savant, Model SC100) for 2 min.

17. Resuspend pellet in 100 µl RNase-free water if performing DNase
treatment, or 500 µl if not DNase-treating the sample.
18. Optional: You may DNase-treat the RNA at this step to remove
contaminating DNA.

a. Add 10 µl of 10× DNase buffer (0.1 M Tris (pH 7.5), 25 mM
MgCl2, 5 mM CaCl2, made with RNase-free water and stored at
-20 °C), and 5 µl (50 U) of DNase I (Roche, Cat. No. 047 716 728
001).

b. Incubate at 37 °C for 30 min.

c. Incubate at 75 °C for 5 min to heat-inactivate the DNase I.

d. Add 400 µl RNase-free water.

e. Add 500 µl acid-equilibrated phenol:chloroform (Sigma, Cat.
No. P1944). Vortex 15 s to mix.

f. Let stand at room temperature until phases have separated
(~10 min), then spin at 14,000 rpm for 10 min at 4 °C.

g. Transfer aqueous phase to a new microcentrifuge tube.

h. Add 500 µl chloroform. Vortex 1 min to mix.

i. Spin at 14,000 rpm for 10 min at 4 °C.

j. Transfer aqueous phase to a new tube.

19. Add 500 µl chloroform to the samples.

20. Vortex for 1 min.

21. Centrifuge samples at 12,000 × g for 10 min at 4 °C.

22. Transfer the aqueous phase to a new microcentrifuge tube and add
15 µl 3 M NaOAc and 900 µl isopropanol. Briefly vortex and allow to
sit at -20 °C for at least 15 min.

23. Centrifuge samples at 12,000 × g for 10 min at 4 °C.

24. Remove supernatant and wash pellets with 1 ml 75% ethanol (prepared
with RNase-free water).

25. Vortex to mix and centrifuge sample at 10,000 × g for 5 min at 4 °C.

26. Remove supernatant and dry pellet in a SpeedVac concentrator for
2 min.

27. Resuspend pellet in 100 µl RNase-free water.

Notes 

1. Depending on the growth conditions being assayed, it may be necessary to
increase the number and length of intervals in the Mini-BeadBeater-8. 
For example, conditions of increased capsule synthesis require upward of 
five 10-min intervals. Experimentation may be required in order to 
determine optimal durations for your growth conditions. 

2. DNase treatment is optional but highly recommended, especially if the 
RNA will later be reverse-transcribed for use in quantitative PCR 
(qPCR). This step appears to be less critical for microarray analysis of 
transcript level, but is nonetheless recommended. 

3. The second round of chloroform extractions (step 19 onward) has in our 
hands led to cleaner RNA extractions that offer greater yields of cDNA 
following reverse transcription. 
5.6. Protein extraction for SDS-PAGE

1. Harvest cells in mid-logarithmic growth phase corresponding to
OD600 = 2 (e.g., if cells are at OD = 0.5, harvest 4 ml) by
centrifugation.

2. Remove medium and resuspend cells in 500 µl ice-cold H2O.

3. Transfer to 2 ml screw-cap microcentrifuge tube (Sarstedt, Cat.
No. 72.693.005).

4. Centrifuge samples at 12,000 × g for 5–10 min.

5. Remove supernatant and flash freeze cell pellet in liquid nitrogen.

6. Lyophilize until pellet is dry.

7. Resuspend pellet in 1 ml ice-cold water.

8. Add 150 µl NaOH/beta-mercaptoethanol mixture (1.85 N NaOH,
7.5% (v/v) beta-mercaptoethanol) to each sample.

9. Incubate on ice with occasional vortexing for 30 min.

10. Add 150 µl trichloroacetic acid (TCA, 55% (w/v) in water kept at 4 °C
in a foil-wrapped bottle) to each sample.

11. Incubate on ice with occasional vortexing for 30 min.

12. Centrifuge samples at 12,000 × g for 10–20 min at 4 °C.

13. Remove most of the supernatant.

14. Optional (when harvesting >1 OD of cells): Add 100 µl ice-cold
acetone to optimize removal of residual TCA.

15. Centrifuge samples at 12,000 × g for 1 min at 4 °C.

16. Remove the remaining supernatant.

17. Resuspend pellet in 50 µl HU buffer (200 mM sodium phosphate
buffer (pH 6.8), 8 M urea, 5% (w/v) SDS, 1 mM EDTA, bromophenol
blue. Store at -20 °C and add 100 mM DTT immediately before use).

Notes 

1. To load samples in HU buffer, care should be taken not to boil them.
Instead, the samples should be heat-denatured at 65–70 °C for
10–15 min or at 37 °C for 30 min.

2. If HU buffer in the resuspended protein pellet turns yellow due to
residual TCA, add 10–20 µl of 1 M Tris (pH 6.8).
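
The harvest volume in step 1 follows from treating the target as a fixed number of OD600 units (culture OD600 multiplied by volume in ml). A minimal sketch of that arithmetic is given below; the function name is illustrative, and the calculation assumes the OD reading is within the linear range of the spectrophotometer.

```python
def harvest_volume_ml(target_od_units: float, culture_od600: float) -> float:
    """Culture volume (ml) that contains the requested number of OD600 units."""
    if culture_od600 <= 0:
        raise ValueError("culture OD600 must be positive")
    return target_od_units / culture_od600


if __name__ == "__main__":
    # Example from step 1: a culture at OD600 = 0.5 requires 4 ml for 2 OD units.
    print(harvest_volume_ml(2.0, 0.5))
```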




6. Methods for Assaying Pathogenesis 

6.1. Murine model of infection 

Mice are relatively susceptible to C. neoformans infection, when compared 
with other mammalian hosts such as rats and rabbits. Immunocompetent 
murine strains will succumb to pulmonary infection, and will experience 
dissemination to other organs including the brain, similar to human
cryptococcosis. A murine model of cryptococcal infection offers several
advantages over other species, including the consistency of susceptibility
within a given strain, the availability of genetically modified strains
(useful for examining host factors that may be involved in infection), as
well as their small size and low cost. Our laboratory utilizes two routes of
infection for introducing C. neoformans into a murine model of infection:
intranasal and intravenous.

6.1.1. Considerations of murine strain and age 

Inbred mouse strains may vary in their susceptibility to C. neoformans
infection. For example, in some studies, BALB/c mice have been demonstrated
to be more resistant to C. neoformans infection than C57BL/6 mice,
as evidenced by both fungal load in the lungs following infection (Chen
et al., 2008; Huffnagle et al., 1998) and degree of dissemination to other
organs (Chen et al., 2008), although in some survival curve analyses,
C57BL/6 mice survive slightly longer than BALB/c mice following infection
with C. neoformans (Nielsen et al., 2005). There are varying theories as
to the source of the differences in susceptibility in mouse strains to
C. neoformans: studies have linked relative resistance of a mouse strain to
the production of Th1-type cytokines, where their production is associated
with pulmonary clearance (Huffnagle et al., 1998), or the presence of
complement protein C5 (Rhodes et al., 1980). Our laboratory performs
murine infections with 5- to 6-week-old A/J mice, which are C5-deficient
and therefore slightly more susceptible to C. neoformans infection than
C5-sufficient mouse strains (e.g., BALB/c and C57BL/6 mice) (Nielsen
et al., 2005; Wormley et al., 2007). It bears noting that the inocula listed
below have been determined by our laboratory for use with our strain of
H99 C. neoformans in A/J mice. Use of other mouse strains may necessitate
adjustment of the inocula to a higher or lower dosage of C. neoformans cells.
Additionally, derivatives of the H99 strain (i.e., H99 stocks maintained by
different laboratories) appear to have varying levels of virulence.

Studies have also shown that the age of the mice may affect their relative 
susceptibility to infection. Older C57BL/6 mice (e.g., 17-week-old) are 
better able to clear an intratracheal infection from their lungs, brains, and 
spleens than younger (e.g., 5-week-old) mice (Blackstock and Murphy, 
2004). We have found it best to infect mice of a consistent age to reduce 
variability in our data. 

6.1.2. Intranasal infection 

An intranasal infection is thought to more closely mimic the natural course 
of infection, beginning with the inhalation of C. neoformans cells leading to 
pulmonary disease followed by dissemination to other organs. The mice are 
first anesthetized with a mixture of ketamine hydrochloride (Orion Pharma 
Animal Health) and medetomidine hydrochloride (Domitor, Orion
Pharma Animal Health) via intraperitoneal injection. For all murine
injections, we use ½ cc insulin syringes with 28G½ needles (Becton Dickinson,
329461). The anesthetic is mixed to contain 18.75 mg/ml ketamine
hydrochloride and 0.625 mg/ml medetomidine hydrochloride. We administer
30–50 µl of this formulation to each mouse (10–15 g), leading to doses of
50–60 mg/kg ketamine hydrochloride and 1.5–2.0 mg/kg medetomidine
hydrochloride. The mice usually succumb to the anesthesia after 5–10 min,
at which point they are weighed, their ears notched for later identification,
and ointment (Artificial Tears, Webster Veterinary, Cat. No. 07-841-4071)
applied to their eyes to prevent them from drying out. A silk thread (50
Denier Weight, obtainable from a sewing supply store) is strung between
two supports; we use ring stands for this purpose. The mice are then
suspended by their incisors upon this silk thread (Fig. 33.4A). The inoculum
Figure 33.4 (A) Intranasal infection. A silk thread is tied across two supports (such as
ring stands). The anesthetized mouse is suspended from its front incisors on the thread.
The inoculum of yeast cells is pipetted down one nare. (B) Intravenous infection. The
mouse is anesthetized with isoflurane administered by face mask (top view), while the
tail vein is dilated through a combination of a sodium acetate heating pad from below
and a heating lamp from above (side view). When the mouse is laid on its side, the
lateral tail vein of the mouse will be at the top of the tail (top view).
of C. neoformans cells (5x10 cells/mouse in 50 µl) is slowly pipetted
directly into one nare using a pipette fitted with a filter tip. Take care to
allow for complete dispensing of the inoculum into the nare; if signs of
struggling are seen in the mouse, pipetting should be suspended until the
mouse no longer shows signs of discomfort. We typically anesthetize and
inoculate batches of five mice at a time. Following completion of
inoculation, the mice remain suspended for 10 min, to allow for complete
aspiration into the lungs, before being lowered and the anesthesia reversed via
intraperitoneal injection of atipamezole hydrochloride (Antisedan, Orion
Pharma Animal Health). We administer 40–50 µl of 1 mg/ml atipamezole
hydrochloride per mouse, leading to a dose of 2.5–3.5 mg/kg. It typically
takes 10–15 min following injection with atipamezole hydrochloride to see
signs of stirring in the mice, and another 15–20 min before the mice begin
to walk around again.
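
The relationship between injected volume, stock concentration, and body weight that underlies the doses quoted in this section can be sketched as follows. The specific volume and weight pairings in the example are illustrative assumptions chosen from within the stated ranges, not values prescribed by the protocol, and any actual dosing should follow the approved animal protocol of the institution.

```python
def dose_mg_per_kg(volume_ul: float, conc_mg_per_ml: float, body_weight_g: float) -> float:
    """Dose delivered by injecting volume_ul of a stock into an animal of given weight."""
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    mass_mg = volume_ul / 1000.0 * conc_mg_per_ml
    return mass_mg / (body_weight_g / 1000.0)


KETAMINE_MG_PER_ML = 18.75
MEDETOMIDINE_MG_PER_ML = 0.625
ATIPAMEZOLE_MG_PER_ML = 1.0

if __name__ == "__main__":
    # Hypothetical pairing: 30 µl of anesthetic mix into a 10 g mouse.
    print(f"ketamine: {dose_mg_per_kg(30, KETAMINE_MG_PER_ML, 10):.2f} mg/kg")
    print(f"medetomidine: {dose_mg_per_kg(30, MEDETOMIDINE_MG_PER_ML, 10):.3f} mg/kg")
    # Hypothetical pairing: 40 µl of atipamezole into a 15 g mouse.
    print(f"atipamezole: {dose_mg_per_kg(40, ATIPAMEZOLE_MG_PER_ML, 15):.2f} mg/kg")
```

With these example inputs the ketamine dose works out to 56.25 mg/kg and the medetomidine dose to 1.875 mg/kg, both within the ranges given above.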

6.1.3. Inoculum preparation

Inocula are prepared by growing C. neoformans in liquid YPAD overnight at 
30 °C. Cells are counted by hemocytometer and, for an intranasal infection, 
1x10 cells are washed twice with PBS and resuspended in 1 ml of PBS. 
Fifty microliters of this inoculum are used per mouse (5x 10 cells). For an 
intravenous infection, 2x10 cells are washed in PBS and resuspended in 
1 ml of PBS. One hundred microliters of this inoculum is used per mouse 
(2x10 cells). Inocula concentrations are confirmed by plating appropriate 
dilutions onto YPAD plates and counting the colony forming units (CFU) 
after 2 days growth at 30 °C. 

6.1.4. Intravenous infection 

An intravenous infection, via the lateral tail vein, leads to more uniform 
dissemination to the organs. The mice are weighed prior to infection and 
marked by ear notching for later identification. They are anesthetized via 
inhalation of 3% isoflurane in oxygen, administered by face mask, then
remain on a sodium acetate rechargeable heating pad (Heat Solution, Prism
Enterprises) beneath a heating lamp during the procedure (see Fig. 33.4B) in
order to dilate the vein so that it is more visible for easier injection. The
inoculum (2x10 cells in 100 µl PBS) is injected into the lateral tail vein.
Following successful inoculation, the mice are immediately removed to 
their cage where they will rapidly recover from the anesthesia. 

6.1.5. Monitoring disease progression 

Mice are weighed prior to infection, and then monitored every 2–3 days
postinfection. Signs of disease progression include hunched posture, abnor- 
mal gait, weight loss, and decreased grooming as indicated by ruffled fur. 
Our laboratory uses two endpoints for assessing time of survival: the point at 
which the mouse has lost 15% of its initial weight, or 25% of its peak weight. 
We find the latter to be more consistent when the mice were infected at a
younger age (e.g., close to 4 weeks in age) and are hence smaller at the initial 
time point. 
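
The two weight-based endpoints described above can be expressed as a simple rule applied to a series of weighings. The sketch below is one possible way to do this; the function name, the list-of-weights input, and the choice to report the index of the first weighing that meets the criterion are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def endpoint_index(weights_g: Sequence[float], criterion: str = "initial") -> Optional[int]:
    """Index of the first weighing at which the endpoint is reached, or None.

    criterion="initial": weight has fallen to 85% or less of the first weight.
    criterion="peak":    weight has fallen to 75% or less of the running peak.
    """
    if criterion not in ("initial", "peak"):
        raise ValueError("criterion must be 'initial' or 'peak'")
    if not weights_g:
        return None
    initial = weights_g[0]
    peak = weights_g[0]
    for i, weight in enumerate(weights_g):
        peak = max(peak, weight)
        if criterion == "initial" and weight <= 0.85 * initial:
            return i
        if criterion == "peak" and weight <= 0.75 * peak:
            return i
    return None


if __name__ == "__main__":
    # Hypothetical weights (g) for a young mouse that gains weight before declining.
    weights = [12.0, 14.0, 16.0, 15.0, 12.5, 11.5, 10.0]
    print(endpoint_index(weights, "initial"), endpoint_index(weights, "peak"))
```

In the hypothetical series, a mouse that grows substantially after infection reaches the peak-weight endpoint one weighing earlier than the initial-weight endpoint, which illustrates why the peak-based criterion behaves more consistently for animals infected when small.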

6.1.6. Murine infection evaluations

"Time-to-endpoint" survival curve analysis monitors the infection of 8–10
mice with a single strain of C. neoformans, until their endpoints (as defined
above). In this manner, mice infected with less virulent strains of C. neoformans
survive longer than mice infected with more virulent strains (Fig. 33.5A).

This analysis gives a gross determination of the virulence of a single strain 
on the entire host system. A more specific analysis might address questions 
such as the initial rate of colonization to a specific organ, the rate of 
proliferation and/or rate of killing by the host immune cells, or the rate of 
dissemination to other organs. This additional analysis may be performed by 
assessing fungal load in the organs at various time points following infection. 
We typically examine fungal loads in the lungs, brain and spleen, although 
we have also examined the liver and kidneys. To measure organ loads after 
the animal is euthanized, the selected organs are removed by dissection, and 
placed on ice in 17 x 100 mm sterile polypropylene tubes (Evergreen Scientific,
Cat. No. 222-2393-080), one tube per organ per mouse. Take care to
wash the dissecting tools in water and ethanol between organs to eliminate
carryover of yeast from organ to organ. Each organ is homogenized in 5 ml
sterile PBS (we use a PRO200 tissue homogenizer, PRO Scientific,
Oxford, CT), then serial dilutions in PBS are plated on Sabouraud dextrose
agar plates (made with Sabouraud dextrose agar, Becton Dickinson, Cat.
No. 211661) containing 40 µg/ml gentamicin and 50 µg/ml carbenicillin
to discourage bacterial growth. CFU are counted, and comparisons can be
made for a single strain across different organs, for the rate of growth in a
given organ over time, or between multiple strains for relative fitness.
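
The back-calculation from a plate count to a whole-organ fungal load can be sketched as follows. This is a minimal illustration, not the authors' own script: the function name, the plated volume, and the example colony count are assumptions, and only the 5 ml homogenate volume comes from the text.

```python
def cfu_per_organ(colonies, dilution_factor, plated_volume_ml,
                  homogenate_volume_ml=5.0):
    """Estimate total CFU in one organ from a single countable dilution plate.

    colonies             -- colonies counted on the plate
    dilution_factor      -- fold dilution of the homogenate (e.g. 1000 for 1:1000)
    plated_volume_ml     -- volume spread on the plate (assumed, not from the text)
    homogenate_volume_ml -- volume the organ was homogenized in (5 ml in the text)
    """
    if plated_volume_ml <= 0 or dilution_factor < 1:
        raise ValueError("plated volume must be > 0 and dilution factor >= 1")
    cfu_per_ml = colonies * dilution_factor / plated_volume_ml
    return cfu_per_ml * homogenate_volume_ml


# Hypothetical example: 85 colonies from 0.1 ml of a 1:1000 dilution.
lung_cfu = cfu_per_organ(colonies=85, dilution_factor=1000, plated_volume_ml=0.1)
print(f"{lung_cfu:.2e} CFU per lung")
```

Averaging the estimates from replicate plates of the same dilution before comparing organs or strains would reduce counting noise.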

6.1.7. Signature-tagged mutagenesis screening 

Evaluation of infectivity and virulence for a large number of C. neoformans 
strains through single-strain infections as described above can quickly add up 
in terms of both time and cost, as many mice must be used for each strain. 
Pooling mutant strains into a single infection allows rapid assessment of 
multiple strains in a single mouse. This can be easily and effectively per- 
formed using a technique known as signature-tagged mutagenesis (STM) 
screening. Each mutant contains a signature tag, or a unique sequence similar 
to a barcode, in its DNA. When pooled together in a group, individual 
strains can still be identified through qPCR of pooled genomic DNA using 
signature-tag-specific primers. By identifying relative representation in the 
pool of genomic DNA before and after infection, relative rates of infectivity 
can be assessed rapidly and reproducibly for multiple mutants in a single 
infection. Using 48 unique signature tag sequences, this technique has been
employed by our laboratory for the production and quantitative analysis of a
library containing ~1200 targeted gene deletion strains (available without
restriction from the Fungal Genetic Stock Center or the American Type
Culture Collection (ATCC)) (Liu et al., 2008).

Figure 33.5 (A) Example of a survival curve. Mice were inoculated via tail-vein
injection with 2 x 10 cells/mouse of either WT (H99) or sre1Δ strains of
C. neoformans. On average, mice infected with sre1Δ survived 30 days longer,
indicating that sre1Δ is a hypovirulent strain. (B) Example of STM score data.
Forty-eight signature-tagged strains were grown individually in liquid YPAD
medium in a 96-well deep-pocket plate, then pooled together to generate the
inoculum. Three mice were inoculated with 5x10 cells/mouse via intranasal
infection. The mice were monitored to the disease endpoint, at which point they
were sacrificed. Shown is a subset of the data from the lungs, following qPCR
and calculation of the STM score for each signature tag in each mouse.

In detail, to analyze a group of 48 signature-tagged strains, the group is 
first grown in liquid YPAD in 96-well deep-pocket plates (Greiner Bio-One,
Cat. No. 780270), one strain per well, at 30 °C without shaking for
3 days. Two hundred microliters of each culture are pooled together, and the
number of cells is assessed by hemocytometer. 2x10 cells (for tail vein
injection) or 1x10 cells (for intranasal infection) are washed twice in
sterile PBS and resuspended in 1 ml sterile PBS. This pool is used as the
inoculum to infect three mice, either by intranasal infection (5x10 cells/
mouse) or tail vein injection (2x10 cells/mouse). Fifty microliters (5x10
cells) of this pool is also plated in triplicate on Sabouraud dextrose agar plates
containing 40 µg/ml gentamicin and 50 µg/ml carbenicillin, which are
then incubated at 30 °C for 2 days. The resulting colonies are scraped off 
each plate, resuspended in water, flash frozen in liquid nitrogen and then 
lyophilized. Genomic DNA is prepared from these samples as described 
above. This DNA constitutes the "input DNA" for later analysis. 

After monitoring and sacrifice of the animals, the organs of interest are 
removed and homogenized in 5 ml sterile PBS. Serial dilutions in triplicate 
are made in sterile PBS and plated on Sabouraud dextrose agar plates 
containing 40 µg/ml gentamicin and 50 µg/ml carbenicillin. These plates
are incubated at 30 °C for 2 days. The resulting colonies are scraped off each 
plate, resuspended in water, flash frozen in liquid nitrogen, and then 
lyophilized. The genomic DNA that is prepared from these samples con- 
stitutes the "output DNA" of the experiment. 

The input and output DNA are analyzed by qPCR using a common
primer targeted to the drug resistance marker that has replaced the targeted 
gene, coupled with signature tag-specific primers. 

6.1.7.1. STM qPCR conditions (50 µl final volume): 400 nM each primer,
0.25 mM dNTPs, 10 mM Tris-HCl (pH 8.3), 50 mM KCl, 2 mM MgCl2,
1.3 M betaine, 1 U Taq polymerase, 0.25 U Pfu polymerase, 1–4 µg
genomic DNA, 2 µl 2x SYBR Green I (Molecular Probes, Cat. No. S-7563).

We maintain a 10x stock of PCR buffer at 4 °C that contains 100 mM
Tris-HCl (pH 8.3), 500 mM KCl, and 20 mM MgCl2. We then add the betaine,
dNTPs, SYBR Green I, Pfu, and Taq polymerases separately. Betaine is
maintained as a 5 M stock at 4 °C. SYBR Green I is kept at -20 °C as a
100x stock in DMSO, and diluted 1:50 in TE buffer to a 2x stock immediately
prior to addition to the PCR mix.
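
As a worked example of the stock dilutions above, the per-reaction volumes of the 10x buffer and the 5 M betaine stock follow from C1·V1 = C2·V2. The sketch below is illustrative only; the helper name is hypothetical, and it covers only the components whose stock concentrations are stated in the text.

```python
def stock_volume_ul(final_conc, stock_conc, reaction_volume_ul):
    """Volume of stock to add so the reaction reaches final_conc (C1*V1 = C2*V2).

    final_conc and stock_conc must be expressed in the same units.
    """
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be more concentrated than the final mix")
    return final_conc / stock_conc * reaction_volume_ul


REACTION_UL = 50.0  # final reaction volume stated in the text

per_reaction_ul = {
    "10x PCR buffer": stock_volume_ul(1.0, 10.0, REACTION_UL),  # 10x -> 1x
    "5 M betaine": stock_volume_ul(1.3, 5.0, REACTION_UL),      # 5 M -> 1.3 M
    "2x SYBR Green I": 2.0,  # fixed volume given in the text
}

for component, volume in per_reaction_ul.items():
    print(f"{component}: {volume:.1f} µl")
```

The remaining volume is taken up by primers, dNTPs, polymerases, genomic DNA, and water, whose stock concentrations are not specified here.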

PCR conditions, performed on a DNA Engine Opticon (MJ Research): 
93 °C for 4 min, followed by 40 cycles of (93 °C for 45 s, 52 °C for 25 s, 72 °C 
for 1 min, then a plate read by the machine), followed by 72 °C for 5 min.

6.1.7.2. Calculating STM score The threshold cycle ($C_T$), or cycle number
where the amplified target reaches a fixed threshold, of each primer pair
is used to calculate an STM score, using a variation of the
$2^{-\Delta\Delta C_T}$ method for quantitation analysis (Livak and
Schmittgen, 2001). For each signature tag, a $\Delta C_T$ is calculated by
subtracting the $C_T$ of the specific primer pair from the median $C_T$ for
all 48 pooled strains
($\Delta C_T = C_{T,\mathrm{median}} - C_{T,\mathrm{tag}}$).
The $\Delta C_T$ values for the three independent input DNA samples are
averaged to calculate the $\Delta C_{T,\mathrm{input}}$ value. The $\Delta C_T$
value for each of the three independent output DNA samples is calculated in
the same way, by subtracting the $C_T$ of the specific primer pair from the
median $C_T$ for all 48 pooled strains. However, the $\Delta C_T$ values
($\Delta C_{T,\mathrm{output}}$) for the three independent output
DNA samples are not averaged. The value $\Delta\Delta C_T$ is then calculated
for each output sample, where
($\Delta\Delta C_T = \Delta C_{T,\mathrm{output}} - \Delta C_{T,\mathrm{input}}$).
The STM score is then equal to $\Delta\Delta C_T$.
The STM scores from the three mice (i.e., each of the three output DNA
samples) are then averaged to determine a final STM score for each mutant.
Strains with reduced levels of persistence in the organ have STM scores less
than 0, while strains with increased levels of persistence have STM scores
greater than 0 (Fig. 33.5B). The STM score correlates with the relative fold
change in persistence of a strain with respect to wild type.
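
The scoring procedure above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names and the three-tag toy data are hypothetical, and each sample is assumed to be a mapping from signature tag to threshold cycle.

```python
from statistics import mean, median


def delta_ct(ct_by_tag):
    """Delta C_T for one DNA sample: median C_T of the pool minus each tag's C_T."""
    pool_median = median(ct_by_tag.values())
    return {tag: pool_median - ct for tag, ct in ct_by_tag.items()}


def stm_scores(input_samples, output_samples):
    """Final STM score per tag.

    Input Delta C_T values are averaged across the input samples; each output
    sample (one per mouse) gives its own Delta-Delta C_T, and these per-mouse
    scores are averaged last.
    """
    input_dcts = [delta_ct(sample) for sample in input_samples]
    tags = list(input_dcts[0])
    dct_input = {tag: mean(d[tag] for d in input_dcts) for tag in tags}
    per_mouse = [
        {tag: dct[tag] - dct_input[tag] for tag in tags}
        for dct in (delta_ct(sample) for sample in output_samples)
    ]
    return {tag: mean(mouse[tag] for mouse in per_mouse) for tag in tags}


# Toy data (hypothetical): tagB amplifies 3 cycles later after infection.
inputs = [{"tagA": 20.0, "tagB": 20.0, "tagC": 20.0}] * 3
outputs = [{"tagA": 20.0, "tagB": 23.0, "tagC": 20.0}] * 3
scores = stm_scores(inputs, outputs)
print(scores)
```

Because the score is a difference of threshold cycles, it behaves like a log2 ratio if amplification efficiency is assumed to be close to 100%, so a score of -3 would correspond to roughly an eightfold drop in relative representation.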

This method of analysis makes the basic assumption in its normalization 
of the data that most of the signature-tagged strains in the pooled infection 
will have phenotypes similar to wild type. If you desire to assay a significant 
number of strains that you believe to have different survival rates than wild- 
type C. neoformans in the mouse, you may need to also seed the inoculum 
pool with signature-tagged strains that are known to have a wild-type 
phenotype to prevent skewing during the normalization process. We fre- 
quently use knockouts in the gene SXI1 (CNAG_06814 in the H99 
sequence database of the Broad Institute) in this manner, as SXI1 is required 
for mating but dispensable for virulence (Hull et al., 2004).

It is important to note that this screen assays for relative persistence of a 
strain within a particular organ. It is not a true test of virulence per se, as it 
is conceivable that a strain may persist in large numbers in a tissue but fail 
to cause disease in the host. However, in our experience (Liu et al., 2008),
the STM screen, when used to assay persistence of mutant strains in the
lungs of 5-week-old A/J mice following intranasal infection, has resulted in
STM scores that are not only reproducible from mouse to mouse and pool to
pool, but also to a certain degree quantitative, by which we mean that
the relative value of the STM score correlates with the relative hypo- and
hypervirulent phenotypes of the strains when assayed by survival curve
analysis.

STM screens also cannot avoid in trans effects from mixing of strains; 
theoretically a wild-type strain may complement the phenotype of a mutant 
strain, allowing for a false negative result. Single-strain infections bypass this 
limitation of the STM screen approach.

6.2. Tissue culture

Although analysis of virulence at the host organismal level offers obvious
correlations between a particular genotype and its efficacy at disease
development, it is often problematic to determine the specific host-pathogen
interactions responsible for a certain virulence phenotype. It is therefore
useful to examine in closer detail the interaction of C. neoformans with a 
particular host tissue or cell type. 

Many studies of the virulence of C. neoformans have focused on its interac- 
tions with the immune system, and, in particular, its interactions with macro- 
phages. Alveolar macrophages are thought to be the first line of defense against 
pulmonary cryptococcal infection. Macrophages and macrophage-derived 
cells have been observed in the periphery of cryptococcal-containing granu- 
loma formations in the lungs during latent infection of immunocompetent 
hosts. Additionally, depletion of macrophages from the murine host through 
the administration of silica has proven to be detrimental to fungal clearance 
(Monga, 1981). C. neoformans mutants that are more susceptible to killing by 
macrophages are hypovirulent in "time-to-endpoint" survival curves (e.g.,
FHB1, which encodes flavohemoglobin; de Jesus-Berrios et al., 2003). For
these and other reasons, it is of interest to examine the interaction of macro- 
phages and macrophage-like cells with C. neoformans yeast. 

Unopsonized C. neoformans cells are rarely taken up by macrophages in 
the absence of activation by cytokines such as IFN-γ or potent antigens such
as lipopolysaccharide (LPS). Therefore, phagocytosis assays and assessments
of killing by macrophages are commonly done in the presence of both 
opsonins (such as anti-C. neoformans antibodies or murine or human sera) 
and activating agents. We most commonly use the murine macrophage-like 
cell line RAW264.7 (American Type Culture Collection, No. TIB-71), 
and have had better success using anti-C. neoformans antibody than sera as an 
opsonizing agent. 



6.2.1. Assay for killing of C. neoformans by macrophages 

1. Seed RAW264.7 macrophages overnight into 96-well tissue culture
plates (Corning, Cat. No. 3598) in 200 µl RAW cell medium (high-glucose
DMEM (UCSF Cell Culture Facility, Cat. No. CCFAA005),
20 mM HEPES/NaOH buffer (pH 7.4) (UCSF Cell Culture Facility,
Cat. No. CCFGL001), 20 mM glutamine (UCSF Cell Culture Facility,
Cat. No. CCFGB002)) with IFN-γ (100 U/ml, Millipore, Cat. No.
005) at a density of 5 x 10^5 cells/well.

2. Culture the strain(s) of C. neoformans in 5 ml YPAD medium overnight 
at 30 °C. 

3. The following day, wash an aliquot of the overnight C. neoformans 
culture 3x in sterile PBS.

4. Resuspend the C. neoformans cells to a concentration of 10 cells/ml.

5. Remove the RAW medium from the macrophages and replace with
200 µl of fresh RAW medium with 30 ng/ml LPS (Sigma, Cat. No.
L4391), 100 U/ml IFN-γ, and anti-C. neoformans antibody.

6. Add 10 µl of the C. neoformans cells (10 cells) to the macrophages, and
10 µl to a well containing only 200 µl of RAW medium. Incubate
for 24 h.

7. Remove supernatant from the wells, and retain for plating. 

8. Lyse the macrophages by adding 0.01% SDS to each well. Wait 15 min,
then remove the lysate and add it to the supernatant previously removed. Repeat at
least three times.

9. Check for complete lysis of the macrophages by microscopy.

10. Dilute the collected supernatants and plate for CFU. Determine the rate 
of killing by the macrophages by comparing the CFU from the wells 
with macrophages to the CFU from the wells without macrophages. 
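
One common way to express the comparison in step 10 is as percent killing relative to the macrophage-free control well. The text does not prescribe a formula, so the sketch below is an assumption, with a hypothetical function name and made-up CFU values.

```python
def percent_killing(cfu_with_macrophages, cfu_without_macrophages):
    """Percent reduction in CFU relative to the macrophage-free control well.

    Negative values mean the yeast grew better with macrophages than without.
    """
    if cfu_without_macrophages <= 0:
        raise ValueError("control well must contain viable cells")
    return 100.0 * (1.0 - cfu_with_macrophages / cfu_without_macrophages)


# Hypothetical counts: 3.0e5 CFU with macrophages, 1.2e6 CFU in the control.
killing = percent_killing(3.0e5, 1.2e6)
print(f"{killing:.1f}% killing")
```

Running a wild-type strain alongside each mutant gives a reference value against which increased or decreased susceptibility can be judged.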



6.3. Assays for characterized virulence factors 

C. neoformans has a number of characteristics previously shown to be 
involved in its virulence. These include (1) the ability to grow at 37 °C, (2)
melanization, thought to aid in resistance to host killing (Nosanchuk and
Casadevall, 2003), and (3) polysaccharide capsule formation, thought to be
involved in host immune system evasion (Del Poeta, 2004; Monari et al.,
2006). Below are methods to test the relative efficiency of a strain for
production of the virulence factors melanin and capsule. 

6.3.1. Melanization 

Melanization, or the ability of the yeast to form dark pigment compounds
from catecholamine substances such as L-DOPA (3,4-dihydroxy-L-phenylalanine)
through the enzyme laccase (Lac1), has long been associated with
C. neoformans virulence. It has been hypothesized that melanin protects
the yeast from oxidative or nitrosative damage originating from the host
cells. To test strains for melanization, we utilize plates containing 100 µg/ml
L-DOPA.

L-DOPA plates

Composition                                                Amount
2% Difco Bacto Agar (Becton-Dickinson, Cat. No. 214030)    20 g
7.6 mM L-asparagine monohydrate                            1 g
5.6 mM glucose                                             1 g
22 mM KH2PO4                                               3 g
1 mM MgSO4·7H2O                                            250 mg
0.5 mM L-DOPA (Sigma, Cat. No. D9628)                      100 mg
0.3 mM thiamine-HCl                                        1 mg
20 nM biotin
Water                                                      to 1 l



To make 1 l of L-DOPA plate medium, autoclave 20 g of Difco Bacto
Agar in 900 ml water so that it dissolves. In 100 ml water, add L-asparagine,
glucose, KH2PO4, MgSO4·7H2O, and L-DOPA in the amounts indicated in
the above recipe. Add phosphoric acid to the medium to pH 5.6, then add
thiamine-HCl and biotin. Mix with the dissolved agar, and pour into plates.
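
When scaling the recipe, it can help to cross-check the stated molarities against the weighed amounts. The sketch below does this for four components; the molecular weights are standard reference values rather than figures from the text, and the dictionary names are illustrative.

```python
# Molecular weights in g/mol (standard reference values, not from the text).
MOLECULAR_WEIGHT = {
    "glucose": 180.16,
    "KH2PO4": 136.09,
    "MgSO4.7H2O": 246.47,
    "L-DOPA": 197.19,
}

# Amounts per litre of medium, in mg, as listed in the recipe.
AMOUNT_MG_PER_L = {
    "glucose": 1000,
    "KH2PO4": 3000,
    "MgSO4.7H2O": 250,
    "L-DOPA": 100,
}


def millimolar(mg_per_litre, mol_weight):
    """Convert a mass concentration (mg/l) to mM using the molecular weight (g/mol)."""
    return mg_per_litre / mol_weight


molarity_mM = {
    name: millimolar(AMOUNT_MG_PER_L[name], MOLECULAR_WEIGHT[name])
    for name in AMOUNT_MG_PER_L
}

for name, value in molarity_mM.items():
    print(f"{name}: {value:.2f} mM")
```

For these four components the computed values agree with the recipe (about 5.6, 22, 1, and 0.5 mM), and 100 mg of L-DOPA per litre corresponds to 100 µg/ml.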



6.3.1.1. Melanization test protocol 

1. Inoculate cultures into YPAD from colonies on a plate for growth 
overnight. 

2. Measure the optical density (OD) by spectrophotometer for each culture 
to be tested. 

3. Dilute the cultures to the equivalent of OD600 = 0.6 with PBS and array
in a 96-well assay plate.

4. Spot 4–6 µl of each diluted strain onto an L-DOPA plate.

5. Incubate for 2–5 days at 30 or 37 °C, under observation (Fig. 33.6A).



Note

We have observed that the kinetics of melanization differ between
growth at 30 and 37 °C, with a greater range of phenotypes visible at
30 °C. If screening a large number of strains, growth at 37 °C may be
useful to highlight the mutants with more extreme defects in melanization.
In addition, it may be useful to monitor the degree of melanization
at early time points, as we have observed some strains that begin to
melanize later than wild type, but reach a similar final level of melanization
after 3 days.

Figure 33.6 (A) Melanin assay (WT and lac1Δ at 30 and 37 °C, by day): The
kinetics of melanization varies depending on the incubation temperature.
lac1Δ is deficient in the primary laccase enzyme responsible for
melanization in C. neoformans. (B) Capsule formation assay: WT (H99) cells grown under
capsule-inducing conditions (DMEM, 37 °C, 5% CO2) and visualized with India ink.
Bar denotes 10 µm.

6.3.2. Capsule formation 

Secretion of a polysaccharide capsule is one of the major virulence factors of 
C. neoformans. When C. neoformans is mixed with India ink, the particles of 
the ink are excluded by a network of capsule fibers, thereby producing a 
characteristic halo around the yeast cell (Fig. 33.6B). Capsule is produced at
low levels in typical YPAD culture; for best visualization, capsule production
must be induced. Capsule can be induced through two different methods, 
described as follows. 

Capsule induction via low nutrient conditions: 

1. Inoculate C. neoformans from a colony on a plate into liquid Sabouraud 
dextrose medium. 

2. Grow overnight at 30 °C. 

3. Dilute the culture 1/100 in 10% Sabouraud dextrose medium buffered 
to pH 7.3 with 50 mM MOPS.

4. Grow cultures at 30 °C for 2 days in a rotating drum. 

Capsule induction via carbon dioxide exposure: 

1. Inoculate C. neoformans from a colony on a plate into liquid YNB medium.

2. Grow overnight at 30 °C.

3. Count cells on hemocytometer.

4. Wash 2 x 10^7 cells three times with PBS.

5. Resuspend the cells in 2.5 ml DMEM in a 6-well tissue culture dish
(Falcon, Cat. No. 35-3224).

6. Culture cells for 24 h at 37 °C with 5% CO2.

Visualization of capsule by India ink staining: 

1. Collect cells grown in capsule-inducing conditions. Concentrate the 
cells by centrifugation for ease of viewing if necessary. 

2. Add 4 µl India ink (obtainable from a stationery store) to 20 µl of culture.

3. Drop 2 µl onto a microscope slide, mount with coverslip glass.

4. Visualize on a microscope with a 60x–100x objective.




7. Concluding Remarks 

While the methods described here are by no means all inclusive for 
what can be accomplished with C. neoformans, we hope they provide a 
guide for working with this basidiomycete, as well as a starting point for 
adapting other techniques.

Abstract

The sequence of Saccharomyces cerevisiae enabled systematic genome-wide 
experimental approaches, demonstrating the power of having the complete 
genome of an organism. The rapid impact of these methods on research in yeast 
mobilized an effort to expand genomic resources for other fungi. The "fungal 
genome initiative" represents an organized genome sequencing effort to promote 
comparative and evolutionary studies across the fungal kingdom. Through such 
an approach, scientists can not only better understand specific organisms but also 
illuminate the shared and unique aspects of fungal biology that underlie the 
importance of fungi in biomedical research, health, food production, and industry. 
To date, assembled genomes for over 100 fungi are available in public databases, 
and many more sequencing projects are underway. Here, we discuss both examples
of findings from comparative analysis of fungal sequences, with a specific
emphasis on yeast genomes, and the analytical approaches taken to mine
fungal genomes. New sequencing methods are accelerating comparative studies 
of fungi by reducing the cost and difficulty of sequencing. This has driven more 
common use of sequencing applications, such as to study genome-wide variation
in populations or to deeply profile RNA transcripts. These and further
technological innovations will continue to be piloted in yeasts and other fungi, and will
expand the applications of sequencing to study fungal biology.

Christina A. Cuomo and Bruce W. Birren
Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA

Methods in Enzymology, Volume 470 © 2010 Elsevier Inc.
ISSN 0076-6879, DOI: 10.1016/S0076-6879(10)70034-3 All rights reserved.




1. Introduction: Yeast Genomes and Beyond 

Saccharomyces cerevisiae was the first eukaryote to be sequenced (Goffeau 
et al., 1996). The ensuing genome-wide functional studies (Aparicio et al.,
2004; Boone et al., 2007; DeRisi et al., 1997; Washburn et al., 2001; Zhu
et al., 2001) using this sequence established the power of genomic
approaches in providing a comprehensive understanding of higher organisms.
Subsequent sequencing of a number of close relatives of S. cerevisiae
demonstrated the additional value of comparative analysis in identifying genes, gene
families, regulatory elements, signatures of selection, and the molecular 
mechanisms that underlie genome evolution. When advances in "whole 
genome shotgun" sequencing methods reduced the cost and organizational 
barriers that had precluded sequencing the larger genomes of filamentous 
fungi, comparative studies of the fungal kingdom began. 

The fungal genome initiative represents a concerted effort by fungal 
researchers and sequencing groups around the world to use genome 
sequencing and comparative analysis to understand individual fungi that 
are important in research, health, and industry and the biological innovations 
that support fungal lifestyles. The fungal kingdom has now been extensively 
sampled by sequencing with ~110 genome assemblies representing 80
species submitted to NCBI to date (Table 34.1) and another ~60 species 
targeted for sequencing (http://www.genomesonline.org). 

Comparative genome analysis has provided insights into the genetic 
differences between groups of fungi and the genomic changes that shaped 
their evolution. Comparing genome sequences between fungi over a wide 
range of phylogenetic distances permits identification of ancient and very 
recent molecular innovations, along with core conserved gene sets among 
all fungi or specific subsets of related fungi. In this chapter, we review what 
has been learned from fungal genome sequencing, with a focus on compar- 
isons that have informed our understanding of Saccharomyces and the general 
evolutionary principles that have helped define the fungal kingdom. With 
the advent of next-generation sequencing methods and the resulting accel- 
eration in genome sequencing, we expect comparative genomics will 
become an increasingly common tool in yeast genetic analysis. 

Within the fungal kingdom, the Saccharomycotina are the most highly 
sequenced subphylum (Table 34.1). The focus on sequencing this group of 
organisms reflects the small size of these genomes relative to other fungi and 
other model systems, the carefully described phylogeny of closely related 



Table 34.1 Fungal genome assemblies in GenBank and EMBL-Bank (Data updated from www.broadinstitute.org, www.jcvi.org,
www.genolevures.org, www.genome.jgi-psf.org, www.genome.wustl.edu, www.ebi.ac.uk, www.yeastgenome.org, www.candidagenome.org,
agd.vital-it.ch, and www.aspergillusgenome.org/)



Genome                            Size (Mb)   Genes          Strains  Phylum           Subphylum
Ashbya gossypii                   9.6         4726           1        Ascomycota       Saccharomycotina
Candida albicans                  14.3-14.4   6094-6160      2        Ascomycota       Saccharomycotina
Candida glabrata                  12.3        5202           1        Ascomycota       Saccharomycotina
Candida guilliermondii            10.6        5920           1        Ascomycota       Saccharomycotina
Candida lusitaniae                12.1        5941           1        Ascomycota       Saccharomycotina
Candida parapsilosis              13.1        5733           1        Ascomycota       Saccharomycotina
Candida tropicalis                14.6        6258           1        Ascomycota       Saccharomycotina
Debaryomyces hansenii             12.2        6272           1        Ascomycota       Saccharomycotina
Kluyveromyces lactis              10.6        5076           1        Ascomycota       Saccharomycotina
Kluyveromyces waltii              10.9        10,721         1        Ascomycota       Saccharomycotina
Lodderomyces elongisporus         15.5        5802           1        Ascomycota       Saccharomycotina
Pichia stipitis                   15.4        5841           1        Ascomycota       Saccharomycotina
Pichia pastoris                   9.4         5313-5450      2        Ascomycota       Saccharomycotina
Saccharomyces bayanus             10.2-11.9   11,992         2        Ascomycota       Saccharomycotina
Saccharomyces castellii           11.4        4700           1        Ascomycota       Saccharomycotina
Saccharomyces cerevisiae          10.7-12.3   5196-5904      7        Ascomycota       Saccharomycotina
Saccharomyces kluyveri            11.3        5321           1        Ascomycota       Saccharomycotina
Saccharomyces kudriavzevii        11.4        nd             1        Ascomycota       Saccharomycotina
Saccharomyces mikatae             12.6        10,311         1        Ascomycota       Saccharomycotina
Saccharomyces paradoxus           12.2        10,554         1        Ascomycota       Saccharomycotina
Saccharomyces pastorianus         22.4        14,152         1        Ascomycota       Saccharomycotina
Vanderwaltozyma polyspora         14.7        5652           1        Ascomycota       Saccharomycotina
Yarrowia lipolytica               20.5        6448           1        Ascomycota       Saccharomycotina
Alternaria brassicicola           30.3        10,688         1        Ascomycota       Pezizomycotina
Ascosphaera apis                  21.5        nd             1        Ascomycota       Pezizomycotina
Aspergillus clavatus              27.9        9125           1        Ascomycota       Pezizomycotina
Aspergillus flavus                37.1        12,074         1        Ascomycota       Pezizomycotina
Aspergillus fumigatus             28.8-29.2   9631-9906      2        Ascomycota       Pezizomycotina
Aspergillus nidulans              30.1        10,560         1        Ascomycota       Pezizomycotina
Aspergillus niger                 33.9        14,165         1        Ascomycota       Pezizomycotina
Aspergillus oryzae                37.1        12,079         1        Ascomycota       Pezizomycotina
Aspergillus terreus               29.3        10,406         2        Ascomycota       Pezizomycotina
Blastomyces dermatitidis          66.6-75.4   9522-9555      2        Ascomycota       Pezizomycotina
Blumeria graminis                 161.4       nd             1        Ascomycota       Pezizomycotina
Botrytis cinerea                  42.7        16,448         1        Ascomycota       Pezizomycotina
Chaetomium globosum               34.9        11,124         1        Ascomycota       Pezizomycotina
Coccidioides immitis              27.7-29.0   10,355-10,608  4        Ascomycota       Pezizomycotina
Coccidioides posadasii            25.5-28.6   9897-10,060    11       Ascomycota       Pezizomycotina
Colletotrichum graminicola        51.6        nd             1        Ascomycota       Pezizomycotina
Fusarium graminearum              36.5        13,332         1        Ascomycota       Pezizomycotina
Fusarium oxysporum                61.4        17,735         1        Ascomycota       Pezizomycotina
Fusarium verticillioides          41.8        14,179         1        Ascomycota       Pezizomycotina
Histoplasma capsulatum            30.4-38.9   9233-9532      5        Ascomycota       Pezizomycotina
Magnaporthe grisea                41.7        11,074         1        Ascomycota       Pezizomycotina
Microsporum canis                 23.2        8765           1        Ascomycota       Pezizomycotina
Microsporum gypseum               23.2        8876           1        Ascomycota       Pezizomycotina
Nectria haematococca              54.4        15,707         1        Ascomycota       Pezizomycotina
Neosartorya fischeri              32.6        10,407         1        Ascomycota       Pezizomycotina
Neurospora crassa                 39.2        9826           1        Ascomycota       Pezizomycotina
Paracoccidioides brasiliensis     29.1-32.9   7876-9136      3        Ascomycota       Pezizomycotina
Penicillium marneffei             28.5        nd             1        Ascomycota       Pezizomycotina
Pyrenophora tritici-repentis      37.8        12,171         1        Ascomycota       Pezizomycotina
Sclerotinia sclerotiorum          38.3        14,522         1        Ascomycota       Pezizomycotina
Stagonospora nodorum              37.2        16,597         1        Ascomycota       Pezizomycotina
Talaromyces stipitatus            35.7        12,449         1        Ascomycota       Pezizomycotina
Trichoderma atroviride            36.1        11,100         1        Ascomycota       Pezizomycotina
Trichoderma reesei                34.1        9129           1        Ascomycota       Pezizomycotina
Trichoderma virens                38.8        11,643         1        Ascomycota       Pezizomycotina
Trichophyton equinum              24.1        8560           1        Ascomycota       Pezizomycotina
Trichophyton tonsurans            23.0        nd             1        Ascomycota       Pezizomycotina
Uncinocarpus reesii               22.3        7798           1        Ascomycota       Pezizomycotina
Verticillium dahliae              33.8        10,535         1        Ascomycota       Pezizomycotina
Verticillium albo-atrum           32.8        10,221         1        Ascomycota       Pezizomycotina
Schizosaccharomyces japonicus     11.3        4814           1        Ascomycota       Taphrinomycotina
Schizosaccharomyces octosporus    11.2        4925           1        Ascomycota       Taphrinomycotina
Schizosaccharomyces pombe         12.6        4962           1        Ascomycota       Taphrinomycotina
Coprinus cinereus                 36.3        13,392         1        Basidiomycota    Agaricomycotina
Cryptococcus neoformans           17.5-19.1   6210-6967      4        Basidiomycota    Agaricomycotina
Laccaria bicolor                  64.9        20,614         1        Basidiomycota    Agaricomycotina
Moniliophthora perniciosa         26.7        16,329         1        Basidiomycota    Agaricomycotina
Phanerochaete chrysosporium       35.1        10,048         1        Basidiomycota    Agaricomycotina
Postia placenta                   90.9        17,173         1        Basidiomycota    Agaricomycotina
Puccinia graminis f. sp. tritici  88.6        20,567         1        Basidiomycota    Pucciniomycotina
Malassezia globosa                8.9         4285           1        Basidiomycota    Ustilaginomycotina
Ustilago maydis                   19.7        6522           1        Basidiomycota    Ustilaginomycotina
Batrachochytrium dendrobatidis    23.7        8794           1        Chytridiomycota  Chytridiomycota
Spizellomyces punctatus           24.1        8804           1        Chytridiomycota  Chytridiomycota
Rhizopus oryzae                   46.1        17,467         1        Zygomycota       Mucormycotina
Encephalitozoon cuniculi          2.9         1997           1        Microsporidia
Enterocytozoon bieneusi           3.9         3632           1        Microsporidia
Nosema ceranae                    7.9         2614           1        Microsporidia

Notes: Genomes listed are taken from GenBank list of submitted assemblies and EMBL-Bank, as of 09-09-09. 

nd, no data; the Zygomycota phyla was not included in the recent "AFTOL classification" of fungi (Hibbett et ah, 2007). 
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species, the wealth of genetic resources for 5. cerevisiae, and the importance 
of Candida albicans as a pathogen. This group comprises two predominant 
subclades: the Saccharomyces group and the Candida group (Fig. 34.1), which 
diverged ~ 200— 400 mya (Hedges et ah, 2006; Taylor and Berbee, 2006). 
These groups include species specialized to live on different carbon sources 



[Figure 34.1: phylogenetic tree (image not reproduced). Clade labels:
Ascomycota (Saccharomycotina, Pezizomycotina, Taphrinomycotina) and
Basidiomycota. Branch annotations: WGD (two branches) and CTG. Scale bar: 0.1.
Tip labels, top to bottom: Saccharomyces cerevisiae, Saccharomyces paradoxus,
Saccharomyces mikatae, Saccharomyces bayanus, Saccharomyces castellii,
Candida glabrata, Vanderwaltozyma polyspora, Zygosaccharomyces rouxii,
Kluyveromyces thermotolerans, Kluyveromyces waltii, Saccharomyces kluyveri,
Ashbya gossypii, Kluyveromyces lactis, Candida albicans, Candida dubliniensis,
Candida tropicalis, Candida parapsilosis, Lodderomyces elongisporus,
Candida guilliermondii, Debaryomyces hansenii, Candida lusitaniae,
Yarrowia lipolytica, Fusarium graminearum, Neurospora crassa,
Sclerotinia sclerotiorum, Stagonospora nodorum, Aspergillus nidulans,
Schizosaccharomyces pombe, Cryptococcus neoformans, Malassezia globosa,
Rhizopus oryzae, Batrachochytrium dendrobatidis.]



Figure 34.1 Phylogeny of sequenced yeasts and other fungi. Species phylogeny 
relationship of the sequenced yeasts in Saccharomycotina and selected other fungi 
based on 20,000 randomly sampled sites from 294 orthologous protein alignments. 
Tree topology and branch lengths were inferred with RAxML (Stamatakis, 2006); 
branches are well supported (>80% of bootstrap replicates) except as indicated with 
an asterisk. The location of whole genome duplication (WGD) events and change in 
CTG decoding are shown.

as well as to live as human commensals and pathogens. Draft genome
assemblies have been released for 23 Saccharomycotina species
(Table 34.1). Several were specifically targeted to trace the evolutionary
history of a whole genome duplication (WGD) event in this group. In addition,
recent work described several genomes related to C. albicans that share the
translation of CTG codons as serine instead of leucine (Butler et al., 2009),
and representatives of other major Saccharomyces groups that did not
undergo a WGD (Souciet et al., 2009).

Genome sequencing has also targeted additional species within the
Ascomycota phylum, of which the Saccharomycotina are one of three subphyla
(Fig. 34.1). Ascomycota is the largest phylum in the fungal kingdom, and
these species share a common morphology of forming asci around meiotic
spores (James et al., 2006). Many species from the Pezizomycotina subphylum
of filamentous fungi have also been sequenced, including many animal
pathogens (dimorphic fungi, dermatophytes, and Aspergillus fumigatus),
plant pathogens (Fusarium graminearum, Magnaporthe grisea, Stagonospora
nodorum), as well as model systems (Aspergillus nidulans and Neurospora crassa).
Sequenced genomes from the basally branching Taphrinomycotina subphylum
include the Schizosaccharomyces (S. pombe, S. octosporus, and S. japonicus)
and Pneumocystis. Despite morphological similarities, DNA sequence analysis
shows that the Schizosaccharomyces are more diverged in evolutionary distance
from S. cerevisiae than the filamentous Ascomycetes are (Fig. 34.1).

Recent classification of fungi suggests that there are seven phyla in the
fungal kingdom, including Ascomycota, Basidiomycota, Glomeromycota,
Microsporidia, and three phyla of Chytrids, as well as four subphyla of the
Zygomycota group of fungi that are not placed in any phylum (Hibbett
et al., 2007). Genome sequencing outside of Ascomycota has been more
limited, but has sampled the other major groups of the fungal kingdom
(Table 34.1, Fig. 34.1). Among these groups, the Basidiomycetes are most
closely related to Ascomycetes and include fungi that span a wide range of
life cycles. Sequenced Basidiomycetes include long-studied human pathogens,
such as the yeasts Cryptococcus neoformans and Malassezia globosa, the
plant pathogens Ustilago maydis and Puccinia graminis, and important models
such as the biodegrading fungus Phanerochaete chrysosporium and the
mushroom Coprinus cinereus. Sequencing of Glomus intraradices, a representative
of the Glomeromycetes, which form symbiotic interactions with plants, has
been undertaken, though the data have proven very difficult to assemble
(Martin et al., 2008). Sequencing of the basal, polyphyletic Zygomycota and
Chytrid groups of fungi has been sparse to date. Sequencing of
Zygomycota has been limited to just two genomes from the Mucorales
subphylum; these include Rhizopus oryzae, the major cause of mucormycosis,
and Phycomyces blakesleeanus
(http://genome.jgi-psf.org/Phybl1/Phybl1.home.html). Chytrid fungi, which are
characterized by having flagella, are subdivided into three phyla; two
species from the Chytridiomycota phylum have been sequenced: the fungal
pathogen of amphibians Batrachochytrium dendrobatidis and the saprobe
Spizellomyces punctatus. Genome sequencing for microsporidia, a group
diverging from within basal fungi (James et al., 2006), has sampled major
pathogens including the human pathogen Encephalitozoon cuniculi and the bee
pathogen Nosema ceranae. Given the tremendous diversity of the
Basidiomycetes, Zygomycota, and Chytrids, many more sequenced
representatives will be required to describe and understand the origin or
lineage-specific evolution of these groups.




2. Computational Prediction of Genes and Noncoding Elements

A primary goal of genome sequencing is a complete gene list for an
organism. New tools for genome annotation make use of a variety of
supporting data to improve prediction not only of protein coding genes
but also RNA-encoding genes and other functional elements. As the genes
of the Saccharomycotina primarily consist of a single exon, early gene
prediction methods for sequenced genomes emphasized identification of
ORFs above a certain size limit. Introns were incorporated into these gene
predictions based on the presence of conserved splice signals. Although
effective in identifying the majority of genes, this approach is confounded
by small genes, which can be impossible to distinguish from randomly
occurring small ORFs. Simple ORF identification also cannot reliably
distinguish between alternative, closely spaced start sites. To identify
small genes with greater specificity, as well as to confirm the more complex
gene structures in filamentous fungi where the majority of genes are
spliced, gene prediction has benefited from evidence of transcription from
EST sequences and hybridization to microarrays. Transcriptional data are
also crucial in revealing alternative splicing in fungal multiexon genes.
Analysis of transcriptional data has shown that retained introns are the
primary mechanism of generating alternate transcripts (McGuire et al.,
2008). Application of new sequencing technologies that allow the direct
sequencing of RNA (RNA-seq) is likely to supplant other methods for
identifying transcripts and thereby greatly increase the depth and
resolution of transcript mapping (Nagalakshmi et al., 2008; Wilhelm
et al., 2008) and gene prediction.
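
The size-threshold approach to ORF identification described above can be
illustrated with a minimal sketch. It assumes an uninterrupted (intron-free)
ORF running from the first ATG in a frame to the next in-frame stop codon;
the function names and the default threshold of 100 codons are illustrative
choices, not those of any particular annotation pipeline.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 100) -> List[Tuple[str, int, int]]:
    """Report ATG-to-stop ORFs with at least min_codons codons (stop excluded).

    Coordinates are 0-based, end-exclusive, and refer to the strand scanned
    (for "-" they index the reverse-complemented sequence).
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first ATG since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3))
                    start = None
    return orfs
```

Lowering min_codons recovers more small genes but also admits more randomly
occurring ORFs, which is the trade-off the text describes. Because only the
first ATG after a stop is used, the sketch also cannot choose between
alternative start sites.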

New gene prediction methods are also being developed to exploit
comparative genomic data. Because genes and other functional elements
show different patterns of sequence conservation than nonfunctional
sequences, comparison of closely related species can help pinpoint
conserved sequences without prior knowledge of their function, including both
genes and regulatory elements (Hardison, 2003). Comparative annotation
requires additional sequences from several species that are each closely
enough related to the genome being annotated to contain the same genes
and regulatory mechanisms and allow unambiguous alignment of the
sequences, while having diverged sufficiently to have accumulated base
changes in nonselected regions. A fine example of this approach is the
analysis of the S. cerevisiae genome. To identify species at an appropriate
evolutionary distance for comparative annotation of S. cerevisiae, light
sequencing was used to evaluate a number of candidate species and an
informative set was selected for subsequent deeper sequencing (Cliften
et al., 2001; Souciet et al., 2000). Species from the sensu stricto group,
which can form viable diploids with S. cerevisiae, were sequenced to deeper
draft coverage, including S. paradoxus, S. mikatae, S. bayanus, and
S. kudriavzevii (Cliften et al., 2003; Kellis et al., 2003). Additional more
distant species important for these comparisons were also sequenced,
including S. castellii and S. kluyveri (Cliften et al., 2003).

Through this comparative approach, ~40 new S. cerevisiae genes were
predicted based on their conservation among the sequenced species. All of
the newly predicted genes were <100 amino acids in length, highlighting
the value of identifying patterns of conservation as an indication of which
of the many small ORFs are functional and under selective pressure.
Further, each of these comparative analyses demonstrated that ~500
predicted genes in S. cerevisiae are likely dubious protein coding genes
because they are not well conserved among closely related species and, in
addition, were not supported by experimental data available at the time.

Comparative annotation methods continue to undergo improvement.
For example, methods developed for Drosophila genomes (Lin et al., 2007)
were recently used to reannotate the diploid C. albicans (Butler et al., 2009).
For whole genome alignments across the Candida clade, measuring codon
substitution frequencies in addition to reading frame conservation in genic
regions provided evidence for 91 new or updated genes, and revealed 222
dubious genes. Dubious genes were manually curated to determine if any
experimental evidence suggested a functional role; 80% have no current
evidence. Comparative analysis also identified 226 well-conserved regions
in C. albicans where an ORF appeared interrupted by a nonsense mutation
or frameshift. Creating a consensus assembly from the sequence of a diploid
organism that is heterozygous at many loci can lead to incorrect merging of
the sequences of alternative alleles. One consequence of these regions of
mixed origin can be problems with gene structures, such as nonsense
mutations or frameshifts. Additional analysis of sequencing reads for this
strain of C. albicans found 80% of these predicted errors to be fixable using
existing data, which led to revision of the consensus sequence and a more
complete gene set.
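
The idea behind reading frame conservation can be sketched as follows. This
is a deliberately simplified stand-in for the published tests, assuming a
single pairwise alignment supplied as two gapped strings of equal length;
the function name and the scoring rule (fraction of aligned bases at which
the net indel length is a multiple of three) are illustrative assumptions.

```python
def reading_frame_conservation(ref_aln: str, qry_aln: str) -> float:
    """Fraction of aligned (gap-free) columns in which the query is in frame.

    A column is in frame when the net indel length accumulated to its left
    is a multiple of three, so the codon phase matches the reference.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")
    shift = 0      # net inserted minus deleted bases in the query so far
    in_frame = 0
    aligned = 0
    for r, q in zip(ref_aln, qry_aln):
        if r == "-" and q == "-":
            continue
        if r == "-":
            shift += 1          # insertion in the query
        elif q == "-":
            shift -= 1          # deletion in the query
        else:
            aligned += 1
            if shift % 3 == 0:
                in_frame += 1
    return in_frame / aligned if aligned else 0.0
```

Under this rule a three-base deletion leaves the score at 1.0, whereas a
single-base deletion near the start of a gene drives it toward zero. A real
ORF under selection is expected to show the former pattern and a spurious
ORF the latter.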

Increasingly, methods have sought to identify the entire transcriptome of
the yeast genome by direct sequencing or hybridization to tiling
microarrays. As high transcript coverage using RNA-seq is now achievable
at low cost, these data can be used to annotate gene structures more
completely, including upstream and downstream untranslated regions. De
novo gene predictions from RNA sequencing data can achieve high coverage
of the annotated genes in S. cerevisiae, although overlapping genes and
small ORFs still pose challenges to these methods. Gene predictions based
on transcriptional data suggest that some dubious ORFs are transcribed
(Yassour et al., 2009), although they are not broadly conserved between
species. Recent analysis of large-scale transcript and proteomic data has also
found that many nonconserved ORFs are transcribed (Li et al., 2008),
though many may be noncoding RNAs.

Population-level variation within a species can also provide evidence of
conservation needed to support gene models. For example, using SNPs
within a species as a measure of divergence, genes are more highly
conserved than intergenic regions, although they are also less well conserved
between strains than species-conserved genes (Li et al., 2008). Such
experiments suggest that multiple sources of high-throughput data may be
required to identify all the ORFs in a genome.

In addition to refining gene annotation, comparative analysis can identify
conserved nongenic sequence, including regulatory elements. By examining
whole genome alignments between orthologous regions,
"phylogenetic footprints" highlight functionally constrained regions.
These elements are far more difficult to identify than genes as they are
short, typically only 6-11 bases in length, contain degenerate bases, and can
vary in distance from the genes they regulate. For example, computational
searches in S. cerevisiae identified a small number of conserved motifs in
intergenic regions, and found them enriched upstream of genes (Cliften
et al., 2003; Kellis et al., 2003); subsequent analysis connected predicted
motifs to binding site data for individual transcription factors (Harbison
et al., 2004). As variation in the sequence of such motifs between species
may contribute to differences in expression and phenotype (Borneman
et al., 2007), detection of binding site occupancy in different species by
chromatin immunoprecipitation followed by either hybridization to tiling
microarrays (ChIP-chip) or direct sequencing (ChIP-seq) may uncover
recent differences in regulatory element usage.
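
A phylogenetic footprint search can be sketched, under strong simplifying
assumptions, as a scan for short windows that are identical in every row of
a multiple alignment of orthologous intergenic sequence. The function name
and the requirement of exact, gap-free identity are illustrative; as noted
above, real regulatory motifs are degenerate, so practical methods score
conservation more permissively.

```python
from typing import List, Tuple


def conserved_windows(alignment: List[str], width: int = 6) -> List[Tuple[int, str]]:
    """Return (start_column, motif) for each gap-free window of the given
    width that is identical across all rows of the alignment."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all alignment rows must have equal length")
    hits = []
    for start in range(length - width + 1):
        window = alignment[0][start:start + width]
        if "-" in window:
            continue  # skip windows containing alignment gaps
        if all(row[start:start + width] == window for row in alignment[1:]):
            hits.append((start, window))
    return hits
```

Overlapping hits can be merged to delimit a footprint. The default width of
6 corresponds to the lower end of the 6-11 base range cited in the text.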




3. Mechanisms of Genome Evolution 

Dramatic differences in genome size and structure within the fungal
kingdom reflect the large span of time that led to the emergence of different
species as well as a variety of genetic mechanisms. Saccharomycotina
genomes are small compared to other fungi, but display surprisingly large
variations in size between related species. The sequenced Saccharomyces have
a median genome size of 12 Mb, and a median of ~5900 protein coding
genes. By contrast, the related Pezizomycotina have a median genome size
of 36 Mb and contain a median of ~10,600 protein coding genes. The large
differences in genome size, such as the 50% variation between Candida
guilliermondii and Lodderomyces elongisporus, are due to variation in intergenic
size rather than in the number of protein coding genes (Butler et al., 2009).
Even within species, genome size can vary widely; the two sequenced
strains of Blastomyces dermatitidis differ by 9 Mb (Table 34.1). Among
fungi, the proportion of the genome comprising repetitive sequences can
also vary substantially; as little as 0.1% of the genome is repetitive in
F. graminearum whereas about 20% of R. oryzae consists of transposable
elements. Among strains of S. cerevisiae, repetitive elements can differ in
number by 10-fold, and some strains with higher repeat content appear
prone to genome instability (Scheifele et al., 2009). Compared to the yeasts,
which have few DNA transposons (Dujon, 2006), the genomes of filamentous
fungi contain a larger variety of different transposable element types.
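
The median figures quoted above imply different average amounts of genomic
sequence per gene in the two groups. The short calculation below makes this
explicit; dividing one median by another is only a rough approximation, and
the function name is an illustrative choice.

```python
def kb_per_gene(genome_size_mb: float, gene_count: int) -> float:
    """Average genomic span per gene in kb (coding plus intergenic sequence)."""
    if gene_count <= 0:
        raise ValueError("gene_count must be positive")
    return genome_size_mb * 1000.0 / gene_count


# Median values quoted in the text.
saccharomyces = kb_per_gene(12, 5900)     # roughly 2.0 kb per gene
pezizomycotina = kb_per_gene(36, 10600)   # roughly 3.4 kb per gene
```

On these figures the Pezizomycotina carry about 1.7 times as much sequence
per gene, so their threefold larger median genome reflects both more genes
and more sequence between them.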

Genome evolution in the Saccharomyces clade has been shaped by an
ancient WGD event. Initial analysis of the S. cerevisiae genome revealed
numerous instances in which the same pairs of genes were found next to
each other at different locations across the genome. These recurring pairs
were suggested to have arisen at the same time from a WGD event (Wolfe
and Shields, 1997). Although WGD was suggested to have doubled the
number of chromosomes from 8 to 16, most duplicated genes were
subsequently lost, with only 12% retained in duplicate in S. cerevisiae. The
sequence from additional yeast genomes that did not undergo a WGD,
Kluyveromyces waltii and Ashbya gossypii, allowed comparison of each WGD
paralogous gene pair to the ancestral organization of orthologs and provided
evidence to unequivocally validate this hypothesis (Brachat et al., 2003;
Kellis et al., 2004). Alignments of S. cerevisiae to either A. gossypii or K. waltii
revealed a "two-to-one" mapping pattern in which two distinct regions
from S. cerevisiae map to a single region in each of these other genomes
(Dietrich et al., 2004; Kellis et al., 2003). In all, 90% of the S. cerevisiae
genome falls into recognizable blocks that map in this manner. Within these
duplicated blocks in S. cerevisiae, genes were conserved in the same order
and orientation as in the corresponding region in A. gossypii or K. waltii.
When the presence of genes in either of the duplicated blocks is compared
to the ancestral version represented by A. gossypii or K. waltii, it is clear that
differential gene loss has occurred between the blocks in S. cerevisiae.
Recognizing such a pattern of interleaved gene loss, as well as a clear 2:1
pairing of centromeres between S. cerevisiae and the preduplication species,
strongly supports the origin of paired regions by a WGD.
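
The two-to-one mapping and interleaved gene loss described above can be
illustrated with a small sketch. It assumes that orthology has already been
assigned, so that each gene in the two sister regions is labelled with the
identifier of its ortholog in the preduplication genome; the function name,
the dictionary keys, and the gene identifiers in the test are hypothetical.

```python
from typing import Dict, List


def classify_wgd_block(ancestral: List[str],
                       copy_a: List[str],
                       copy_b: List[str]) -> Dict[str, object]:
    """Compare two sister regions against the ancestral gene order.

    Returns which ancestral genes are retained in both copies, in only one
    copy, or in neither, and whether both copies preserve ancestral order
    (in either orientation).
    """
    position = {gene: i for i, gene in enumerate(ancestral)}

    def collinear(copy: List[str]) -> bool:
        ranks = [position[g] for g in copy if g in position]
        return ranks == sorted(ranks) or ranks == sorted(ranks, reverse=True)

    in_a, in_b = set(copy_a), set(copy_b)
    return {
        "both": [g for g in ancestral if g in in_a and g in in_b],
        "only_a": [g for g in ancestral if g in in_a and g not in in_b],
        "only_b": [g for g in ancestral if g in in_b and g not in in_a],
        "lost": [g for g in ancestral if g not in in_a and g not in in_b],
        "collinear": collinear(copy_a) and collinear(copy_b),
    }
```

A block consistent with WGD is collinear in both copies, has few genes in
"lost", and has its single-copy genes split between "only_a" and "only_b",
so that the two sister regions interleave to reconstruct the ancestral order.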

Evolutionary pressures may act differently on each member within the
pairs of genes retained in duplicate after a WGD. In Ohno's model of
evolution after duplication, one gene maintains the ancestral role whereas
the other duplicate is free to diverge in sequence and function (Ohno,
1970). In S. cerevisiae, nearly all cases of accelerated evolution involve only
one of the two paralogs of retained duplicate pairs (Kellis et al., 2004).
Such computational analyses support a model of neo-functionalization,
where one of the paralogs retains the ancestral function and the other
evolves a new function, over a model of subfunctionalization, in which
each new copy assumes part of the ancestral function (Lynch and Force,
2000).
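
Asymmetric divergence between paralogs can be illustrated by comparing each
copy with its single ortholog in a preduplication species. The sketch below
assumes three prealigned protein sequences of equal length and uses the
uncorrected proportion of differing sites; this is a simplification rather
than the procedure of the cited studies, and the function names and the
sequences in the test are invented for illustration.

```python
from typing import Tuple


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites among columns without gaps."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no gap-free columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def divergence_asymmetry(paralog1: str, paralog2: str,
                         outgroup: str) -> Tuple[float, float, float]:
    """Return each paralog's distance to the outgroup and the ratio of the
    larger distance to the smaller (infinite if one copy is unchanged)."""
    d1 = p_distance(paralog1, outgroup)
    d2 = p_distance(paralog2, outgroup)
    low, high = min(d1, d2), max(d1, d2)
    ratio = high / low if low > 0 else float("inf")
    return d1, d2, ratio
```

A ratio near 1 indicates that both copies have diverged at similar rates,
whereas a large ratio indicates that one copy has evolved much faster, the
pattern reported for most accelerated pairs in S. cerevisiae.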

Examining additional postduplication species has shown that the same
genes are often maintained as WGD duplicates, but the pattern of gene loss
differs. Analysis of additional loss in C. glabrata and S. castellii compared to
S. cerevisiae revealed that while one copy of most WGD duplicates was
resolved prior to speciation, reciprocal and independent loss events were
also observed. Analysis of Kluyveromyces polysporus, a WGD yeast most
diverged from S. cerevisiae, identified a similar fraction of genes retained as
WGD pairs. Roughly half the gene pairs are found in both species, and they
are enriched for protein kinases, cell wall organization, and carbohydrate
metabolism, suggesting that subsequent to WGD there was a general
advantage to retaining these genes. For the many genes retained in single
copy, the retained paralog differs, suggesting that the retention of different
copies of duplicated genes may have played a role in speciation (Scannell
et al., 2006).

WGD events have been described in other eukaryotes and are likely to
have occurred numerous times in the long history of fungal evolution.
Recently WGD has been documented within the Mucorales group of basal
fungi (Ma et al., 2009). R. oryzae, the primary cause of mucormycosis,
contains 9% of genes in duplicated pairs distributed across the genome.
Comparison to a related Mucorales species, P. blakesleeanus, established a 2:1
relationship for 78% of the R. oryzae gene pairs. While a similar number of
genes were retained in duplicate after the WGD events in the Saccharomyces
and the Mucorales, the genes retained as duplicates comprise quite different
functional groups. In R. oryzae, the duplicates include nearly all the
individual subunits of the proteasome and mitochondrial ATPase. Finer dating
and analysis of the WGD in R. oryzae will require identification and
sequencing of a more closely related preduplication genome.

Duplication of individual genes is also a primary mechanism by which
new functions evolve. Unlike genes originating with a WGD event, which
are all created at the same time, individual gene duplications can arise over
time, leading to gene families in which the genes display a wide range of
sequence similarity. As mentioned above, following gene duplication,
alternative fates including neo-functionalization or subfunctionalization can
create new gene functions. Individual gene duplications played a major
role in enlarging and diversifying gene families important for pathogenesis
in Candida species (Butler et al., 2009). Variation in gene copy number is
found more widely in genes involved in transport, the cell wall, and stress
response across the Ascomycetes (Wapinski et al., 2007), suggesting greater
variation in these processes. Strikingly, some genomes are limited in the
preservation of gene duplicates by a genome defense process called repeat
induced point mutation (RIP), in which duplicated genes are highly
mutated during meiosis. As RIP renders duplications nonfunctional, there
is strong selection against duplication in organisms with active RIP. The
N. crassa genome has been profoundly shaped by RIP, with a dearth of both
gene duplications and transposons (Galagan et al., 2003). A similar process
appears to constrain gene duplication and transposons in F. graminearum
(Cuomo et al., 2007). This lack of opportunity for gene innovation via
duplication suggests that these species may be more constrained in adapting
to changing environments.

Related species of Saccharomyces have largely collinear genomes in which
gene order has been highly conserved in large blocks, termed syntenic
regions. Within the sensu stricto group, only a few inversion and
translocation events differentiate the sequenced species (Kellis et al., 2003).
The breaks in synteny are often accompanied by the presence of repetitive
sequences, suggesting that rearrangements between similar repeats underlie
these few large-scale structural changes. However, in some species particular
genomic regions, such as subtelomeres, often display a faster evolutionary
rate than the rest of the genome. In Saccharomyces, differences between the
sensu stricto species were highest at the subtelomeres (Kellis et al., 2003).
Among the subtelomeric gene families undergoing rapid change, in both
copy number and sequence, are FLO genes and other surface markers that
govern how cells are recognized by each other and the environment
(Reynolds and Fink, 2001; Verstrepen et al., 2005).

In other species, higher rates of evolution are not confined to the
subtelomeres. In F. graminearum, a pathogen of wheat, genome-wide
diversity between two strains was high at the subtelomeres, but also at discrete
internal regions (Cuomo et al., 2007). Chromosome number in F. graminearum
is reduced compared to related species, and whole genome alignment
with F. verticillioides supports a hypothesis that high diversity has been
maintained at sites of chromosome fusion. Aspergillus spp. also contain
genomic regions of lineage-specific sequence, which include many subtelomeres
(Fedorova et al., 2008; Galagan et al., 2005). As in S. cerevisiae, particular
gene families are found in the faster evolving regions of the filamentous
fungal genomes. These include secreted and other proteins implicated in
plant interactions in F. graminearum (Cuomo et al., 2007), and secondary
metabolite gene clusters, transporters, and proteins involved in metabolism
and detoxification in A. fumigatus (Fedorova et al., 2008). As a practical
matter, any genomic region with a large concentration of a given gene
family or families is likely to be poorly represented in a typical draft genome
assembly. The software used to assemble shotgun sequence data is
confounded by repeated sequences. Thus, the regions and gene families that
are of greatest interest in defining recent evolutionary events, or
population-based differences, are often underrepresented or entirely absent from
draft assemblies. The fidelity of subtelomeric regions in draft assemblies is
likely to further degrade as genome sequencing increasingly relies on new
technologies that produce shorter read lengths from shorter DNA fragments.
For some genomes, targeted sequencing was undertaken to fill in
genes missing from the subtelomeric regions in draft assemblies (Wu et al.,
2009).

The availability of whole genome sequences allows for a comprehensive
search and phylogenetic analysis of potential horizontal gene transfer events.
Analysis of the fungal genomes sequenced to date indicates that while
horizontal transfer within fungi is not as frequent as in bacteria, such events
can be very important for altering phenotype. One such case was seen in the
analysis of the genome of the wheat pathogen S. nodorum. An ortholog of
the ToxA toxin gene from Pyrenophora tritici-repentis was identified in the
S. nodorum genome; the ToxA orthologs share 99.7% similarity (Friesen
et al., 2006) whereas best bidirectional hit orthologs show a median of 71%
similarity. The presence of an adjacent hAT transposon suggested that
transposons may contribute to interspecies transfer. Horizontal transfer
has also been detected between fungi and other eukaryotes, including
transfer from filamentous fungi to oomycetes (Richards et al., 2006),
between plants and fungi (Richards et al., 2009), and recently between
viruses and fungi (Frank and Wolfe, 2009). Clusters of linked genes may
also be transferred as a group, such as those which produce secondary
metabolites (Khaldi et al., 2008). An extreme case of horizontal transfer
may involve conditionally dispensable chromosomes in Fusarium spp., which
are enriched for genes that do not follow the species phylogeny, display
atypical GC content and codon usage, and contain a high percentage of
transposable elements (Coleman et al., 2009; Ma et al., 2010).
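
The ToxA case suggests a simple screen: flag ortholog pairs whose similarity
far exceeds what is typical for the two genomes being compared. The sketch
below is one hedged way to express that idea. The function name and the
20-point margin are arbitrary illustrative choices, and a flagged gene is
only a candidate that would still require phylogenetic confirmation.

```python
from statistics import median
from typing import Dict, List


def flag_unusually_similar(similarity: Dict[str, float],
                           margin: float = 20.0) -> List[str]:
    """Return genes whose percent similarity to their best bidirectional
    hit exceeds the genome-wide median by at least `margin` points."""
    if not similarity:
        return []
    typical = median(similarity.values())
    return sorted(g for g, pct in similarity.items() if pct - typical >= margin)
```

With the values quoted in the text (99.7% for ToxA against a median of 71%),
ToxA exceeds the median by 28.7 points and would be flagged at this margin.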




4. Genomic Potential for Sex 

Completed genome sequences provide the catalog needed to investigate
the presence of entire pathways or complex biological processes. The
conservation of genes required for mating and meiosis is sought as evidence
for whether sexual cycles can occur, and how similar they may be in
different fungi. While mating and meiosis have been well described in
Saccharomyces, sexual cycles in other fungi can be less standard, often lacking
some features common to most eukaryotes. C. albicans can mate through a
parasexual cycle which is not thought to involve meiosis, and some
components of meiosis were found to be missing from the genome sequence
(Tzung et al., 2001). Further analysis of these genes in related Candida clade
genomes demonstrated that they are missing in all Candida. Strikingly, the
sexual Candida species are missing additional meiotic and mating genes,
including proteins involved in synaptonemal complex formation and
recombination (Butler et al., 2009). Some of these proteins are also missing
in other eukaryotes, such as Drosophila and Caenorhabditis elegans, suggesting
considerable plasticity of the genes involved in meiosis. Innovation of
meiotic genes within the Saccharomyces was supported by finding many
meiosis and sporulation genes unique to the Saccharomyces clade among
Ascomycetes, including the meiotic regulator IME1 (Wapinski et al.,
2007). For other species, analysis of the genome sequence has provided the
first evidence of a sexual cycle. In Aspergilli, mating was inferred to occur
in A. fumigatus from the genome sequence (Galagan et al., 2005), and
subsequently shown experimentally (O'Gorman et al., 2009).




5. Gene Family Conservation and Evolution 

Examining the conservation of genes between species can help predict 
the potential functional capacity of an organism. Such comparative genomic 
analysis permits tracing the history of gene gain and loss events, which 
suggest changes in functional capacity along specific lineages. Protein simi- 
larity is the basis for clustering genes into families, which can then be refined 
based on synteny or phylogeny to establish orthology, which sequence 
divergence can obscure over time. 
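
The first step described above, clustering genes into families by protein
similarity, can be sketched as single-linkage clustering over pairwise
similarity hits. The sketch assumes the hits have already been computed
(for example, by an all-against-all protein search) and are supplied as
(gene, gene, percent identity) tuples; the function name, the 30% cutoff,
and the gene names in the test are illustrative assumptions. The refinement
by synteny or phylogeny mentioned in the text is not attempted here.

```python
from typing import Dict, Iterable, List, Tuple


def cluster_families(genes: Iterable[str],
                     hits: Iterable[Tuple[str, str, float]],
                     min_identity: float = 30.0) -> List[List[str]]:
    """Group genes into families by single linkage: two genes share a family
    if a chain of hits at or above min_identity connects them.
    Every gene named in `hits` must also appear in `genes`."""
    genes = list(genes)
    parent: Dict[str, str] = {g: g for g in genes}

    def find(g: str) -> str:
        # Follow parent links to the family representative, halving paths.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b, identity in hits:
        if identity >= min_identity:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    families: Dict[str, List[str]] = {}
    for g in genes:
        families.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in families.values())
```

Single linkage is permissive: one spurious hit can merge two unrelated
families, which is one reason a subsequent refinement step is needed to
establish orthology.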

Comparative analysis of Saccharomyces genomes suggests that most genes are
part of a common core set. A. gossypii has a small genome and gene content
compared to other Saccharomyces, and nearly all (95%) of the genes from
A. gossypii have homologs in S. cerevisiae. This conservation suggests that the
core genome is ~4500 genes, of which most are present in syntenic regions.
The noncore gene set for each species includes examples of specialization,
for example, the loss of the galactose utilization pathway in S. kudriavzevii
(Hittinger et al., 2004).

The divergence between the Candida and Saccharomyces clades is apparent
in the considerable variation seen in gene families. In a total of 6209
gene families examined for seven Candida clade species and nine
Saccharomyces clade species, only about half (3923) were conserved between the
two groups (Butler et al., 2009;
http://www.broadinstitute.org/annotation/genome/candida_group/MultiHome.html).
By contrast, 3765 gene families
are specific to the Saccharomyces clade and 1521 are specific to the Candida
clade. Several families important for pathogenesis were identified as
expanded or unique to the Candida genomes compared to Saccharomyces.
These include gene families that encode cell surface proteins, including the
Als adhesins and the Iff/Hyr GPI-linked proteins; other cell wall families
such as the FLO genes are highly specific to Saccharomyces. The Als and Flo
families differ in sequence, but have a very similar structure. Each contains
intragenic tandem repeats, which can vary in copy number and thereby
contribute to phenotypic differences (Verstrepen et al., 2005). In addition to
variation at the cell surface, differences in nutrient acquisition are found,
with some proteins involved in transport overrepresented in the Candida.

While S. cerevisiae is separated from the other well-studied yeast S. pombe
by a long evolutionary distance spanning the filamentous Ascomycetes,
these species share some similar genomic features. Some genes appear to
have duplicated in parallel in these species, allowing for adaptation to similar
environments (Hughes and Friedman, 2003). Gene families specific to the
yeasts S. cerevisiae and S. pombe included four nuclear pore proteins, which
were presumably lost from the ancestor of the filamentous fungi (Cornell
et al., 2007).

By examining gene families built across the Ascomycetes, innovations
within specific subgroups can be identified. A large fraction (84%) of
S. cerevisiae genes are conserved in orthology groups which also are conserved
in filamentous fungi (Wapinski et al., 2007). Although many essential yeast
proteins are in this conserved set, a significant fraction of spindle pole body
components are specific to yeast (Cornell et al., 2007; Wapinski et al., 2007).
The larger genomes of filamentous fungi contain expansions of specific gene
families involved in transport of molecules in and out of cells and DNA
binding proteins. The expansion of DNA binding proteins may suggest
more complex regulation of transcription required to control a larger number
of genes in the filamentous fungi. Within the filamentous plant pathogens,
there is particular innovation of genes involved in secondary metabolite
production, including cytochrome P450 and polyketide synthases (Soanes
et al., 2008). Such analysis in other more basal fungal phyla will be increasingly
possible as the number of sequenced genomes increases.



6. Impact of Next-Generation Sequencing 

The advent of new sequencing technologies that produce dramatically 
less expensive data at vastly higher throughput is elevating the use of 
sequencing as a general purpose tool for studying genomes and biology. 
Several different sequencing technologies such as Illumina, 454, and ABI 
SOLiD are now in routine use, with more expected soon. Each uses 
amplification of DNA fragments and does not rely on propagation of 
bacterial clone libraries; sequencing chemistries differ, as do other properties 
such as read length and quality profiles which rapid development continues 
to improve. Each of these methods has distinct advantages, preferred appli- 
cations, and cost structures, though the ongoing rapid changes in these 
platforms means that the relative differences are likely to shift. 
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New laboratory methods and assembly algorithms are still being optimized 
to address the inherent difficulty in assembling genomes from the shorter 
sequence reads produced by these new methods. However, early results 
indicate that high-quality genome assemblies can be produced by these new 
technologies. Resequencing a previously finished genome provides the most 
rigorous validation of any new sequencing strategy. We have resequenced N. 
crassa using several new technologies, along with genomes that have draft 
sequences but have higher amounts of repetitive DNA. After testing a variety 
of coverage depths, fragment and read lengths, and technology types, we 
found that assemblies can be produced that rival or greatly surpass the quality of 
those generated by traditional Sanger sequencing, according to measures such 
as genome coverage, contig length, and consensus base quality (C. Nusbaum, 
personal communication). Further, new genome sequencing methods can 
capture genomic sequences that are typically hard to obtain by older methods 
that relied on cloning genomic fragments in bacteria. For example, sequencing 
the N. crassa genome with 454 captured over 1 Mb of additional genome 
sequence that was missing from the original assembly based on Sanger 
sequencing. The sequences obtained only by 454 were found to be much 
higher in AT content than average and were absent from the bacterial clone 
libraries. Finally, it is important to note that the overall quality of the 
consensus sequence found in draft assemblies produced from either 454 or 
Illumina is extremely high. In fact, the consensus quality within contigs of 
these draft assemblies matches that of finished sequence from prior sequencing 
data. Assembling short read sequencing data from fungal genomes generally 
yields assemblies with short contig lengths. Providing additional linking 
information, for example, by including Sanger paired-end sequence from 
40 kb Fosmids, or end sequences from jumping fragments, will improve the 
continuity of the assembled sequence. Thus, N. crassa assemblies generated 
from a hybrid of 454 and Sanger sequence from Fosmid ends are comparable 
to the previous Sanger-only assembly in terms of genome coverage and contig size. For 
a more repetitive genome, such as P. brasiliensis, contigs are about fourfold 
smaller on average than those of an assembly based on Sanger sequence alone (though 
again, the hybrid assembly combining both data types covered more of the 
genome). More recent work demonstrates that draft sequences can be assem- 
bled for 40 Mb fungal genomes from Illumina data, which points to even 
further reductions in cost, if solutions to the assembly of highly repetitive or 
polymorphic genomes are found. 
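
Contig length, one of the assembly measures mentioned above, is often summarized as an N50 value: the largest length L such that contigs of length L or greater together contain at least half of the assembled bases. The following is a minimal sketch of that calculation, together with a simple genome coverage fraction. The function names, the contig lengths, and the genome size are all invented for illustration and are not taken from any assembly discussed in this chapter.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (in bases).

    N50 is the largest length L such that contigs of length >= L
    together contain at least half of all assembled bases.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half the bases are covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def genome_coverage(contig_lengths, genome_size):
    """Fraction of an (assumed known) genome size represented in contigs."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(contig_lengths) / genome_size


# Hypothetical contig lengths (bases) and genome size, for illustration only.
example_contigs = [400_000, 250_000, 150_000, 100_000, 60_000, 40_000]
example_n50 = n50(example_contigs)
example_coverage = genome_coverage(example_contigs, genome_size=1_250_000)
```

Comparing such summary values between a short-read assembly and a Sanger-based assembly of the same genome is one simple way to express the differences in contig size described above.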

In contrast to the large number of draft genome assemblies that have been 
produced, only a few have been fully finished, that is, have been completely 
sequenced from telomere to telomere. The paucity of finished sequence 
reflects the much greater cost and effort required compared to generating 
draft sequences. Although certain key fungi will warrant the expense of fully 
finished sequence, large-scale efforts are likely to continue to focus on draft 
genomes, especially as the cost differential between draft and finished 
sequence is likely to grow with the adoption of shotgun sequencing methods 
that use new technologies. Hence, it is important to consider potential 
consequences of this distinction. Systematic losses of specific classes of 
sequence from draft assemblies routinely impact genome analysis. Repetitive 
sequences are frequently missing from draft genome assemblies or are greatly 
underrepresented. For example, repeat elements at centromeres and telo- 
meres, or local gene clusters with highly repeated domains are rarely found 
intact in genome assemblies, and the highly repeated rDNA genes are 
underrepresented in draft assemblies. Repeated gene families that lie in 
subtelomeric regions are also often incorrectly assembled or missing from 
draft assemblies. Because these subtelomeric regions often harbor rapidly 
evolving genes that hold clues to an organism's most recent history and 
ecological specialization, draft assemblies are a poor substrate for comparisons 
of very recently diverged relatives, strains, or isolates. As a rule, multiple lines 
of evidence, including experimental data, are needed to confirm genes 
suspected to be missing from a genome based on draft assemblies. 

The amount and location of sequence missing from a draft assembly 
can be estimated by comparison to an independent genome map, such as a 
physical, genetic, or optical map, or to a reference gene set, such as from 
ESTs. Optical maps, which are based on automated restriction fragment 
measurements (Samad et al., 1995), have proven practical and highly useful 
for fungal genomes; for the highly repetitive genome of R. oryzae, an optical 
map was instrumental in evaluating and improving the large-scale accuracy 
of the assembly (Ma et al., 2009). Alignment of genome sequence to the 
map not only allows sequence contigs to be correctly anchored and ordered 
along the chromosomes but also provides a measure of the size of sequence 
gaps. In addition to estimates of missing sequence based on optical maps or 
known genome sizes, the extent of missing genes can be determined by 
aligning a reference gene set or a set of ESTs to the assembly, and assessing 
those that do not align for genic potential. Researchers with interests in 
specific regions of the genome that are challenging for whole genome 
sequencing strategies, such as repetitive regions and segmental duplications, 
need to consider targeted sequencing efforts to better address the 
challenges of representing these regions. 
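
The gene-based check described above can be sketched in a few lines. This example assumes that an alignment step has already been run and reduced to, for each reference gene or EST, the best fraction of its length that aligned to the assembly. The input format, the function name, the 0.8 threshold, and the gene identifiers are hypothetical choices made for illustration.

```python
def summarize_missing_genes(best_aligned_fraction, min_fraction=0.8):
    """Flag reference genes/ESTs that align poorly to a draft assembly.

    best_aligned_fraction: dict mapping a gene or EST identifier to the
        best fraction (0..1) of its length aligned to the assembly.
    min_fraction: hypothetical cutoff below which a gene is flagged.

    Returns (sorted list of flagged identifiers, fraction flagged).
    """
    if not best_aligned_fraction:
        raise ValueError("no alignment results supplied")
    flagged = sorted(
        gene for gene, fraction in best_aligned_fraction.items()
        if fraction < min_fraction
    )
    return flagged, len(flagged) / len(best_aligned_fraction)


# Invented alignment results, for illustration only.
example_alignments = {
    "geneA": 1.00,
    "geneB": 0.95,
    "geneC": 0.20,
    "geneD": 0.00,
    "geneE": 0.85,
}
flagged_genes, flagged_fraction = summarize_missing_genes(example_alignments)
```

Flagged entries are only candidates. As noted earlier, independent evidence is needed before concluding that a gene is truly absent from the genome rather than absent from the draft assembly.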



7. Future Directions 

The growing ease and lower costs of genome sequencing will continue 
to profoundly alter fungal research. In the past, genome sequencing was the 
domain of large specialized centers, but in the future individual investigators 
will be encouraged to sequence new fungi as part of their research agenda. 
This change is likely to produce sequence from a far greater range of organisms 
and these resources will further enable research on a range of fungi. 
Thus, while genome sequencing has previously emphasized a single 
phylum, the Ascomycota, recent efforts seek to increase coverage of the 
basally branching groups of fungi. Survey sequencing for use in building 
multigene species phylogenies is becoming more cost-effective. Greater 
sequence coverage of basal fungi and their closest relatives is essential to 
define the commonalities of all fungi. In addition, sequencing of further 
early diverging eukaryotes will allow more detailed comparison of genomic 
changes that differentiate fungi from their nearest eukaryotic relatives. 
Further sequencing of microsporidia, a group placed near or within the 
fungal kingdom, should help shed light on the evolution of this group and 
clarify its phylogenetic relationship to other fungi. 

As sequencing becomes cheaper, the ability to sequence multiple strains 
from a population, or even populations of fungi, becomes feasible. While 
initial sequencing efforts targeted single representatives of a species, the new 
methodologies will permit a more comprehensive understanding of the genome 
structure of an organism, by understanding diversity within the population 
and the population's response to selective pressures. Future studies will be 
able to sequence hundreds of strains or isolates for what a single genome cost 
just a few years ago. Two recent studies examined polymorphism in collections 
of ~70 S. cerevisiae strains each, from diverse geographic locations and sources of 
origin (Liti et al., 2009; Schacherer et al., 2009). Sequencing multiple strains 
yields maps of the diversity within a species, which in turn can be used to 
identify genes and regions under heightened selective pressures. For exam- 
ple, one could identify rapidly evolving genes that may be important in the 
recent evolution of emerging pathogens, or highly constrained sequences 
that may be important targets for diagnostics, vaccines, or therapeutic 
measures. Finally, fungal researchers are likely to benefit from the power of 
new sequencing technologies to quantify expression. The combination of 
high dynamic range, base pair resolution, and rapidly falling costs makes 
this avenue highly attractive. Direct RNA sequencing using new 
technologies is already adding much needed precision to the task of genome 
annotation. A single lane of Illumina sequence represents a comparable 
cost to a microarray experiment, and barcoding of samples allows greater 
throughput. Sequencing by these new approaches also has the advantage of 
simultaneously detecting variation in transcripts, which could be used to 
assay mutations between samples or differences in expression of alleles. 
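
One common way to summarize the within-species diversity discussed above is nucleotide diversity, the average number of pairwise differences per site among aligned sequences. This measure is introduced here for illustration and is not named in the text. The sketch below assumes a gap-free alignment of equal-length sequences, and the four short sequences are invented.

```python
from itertools import combinations


def nucleotide_diversity(aligned_sequences):
    """Average pairwise differences per site for a gap-free alignment."""
    n = len(aligned_sequences)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned_sequences[0])
    if length == 0 or any(len(s) != length for s in aligned_sequences):
        raise ValueError("sequences must be non-empty and of equal length")

    # Count mismatched positions over every unordered pair of sequences.
    total_differences = sum(
        sum(1 for x, y in zip(a, b) if x != y)
        for a, b in combinations(aligned_sequences, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_differences / (n_pairs * length)


# Invented alignment of one short region from four strains.
example_alignment = [
    "ACGTACGT",
    "ACGTACGA",
    "ACGAACGT",
    "ACGTACGT",
]
example_pi = nucleotide_diversity(example_alignment)
```

Computed in windows along a chromosome, a value of this kind gives one simple diversity map in which unusually variable or unusually constrained regions stand out.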



Abstract 

Budding yeast are capable of displaying various modes of oscillatory behavior. 
Such cycles can occur with a period ranging from 1 min up to many hours, 
depending on the growth and culturing conditions used to observe them. This 
chapter discusses the robust oscillations in oxygen consumption exhibited by 
high-density yeast cell populations during continuous, glucose-limited growth 
in a chemostat. These ultradian metabolic cycles offer a view of the life of yeast 
cells under a challenging, nutrient-poor growth environment and might repre- 
sent useful systems to interrogate a variety of fundamental metabolic and 
regulatory processes. 




1. Introduction 

Many cyclic and oscillatory phenomena can be observed in nature. 
Perhaps the most recognized of such periodic phenomena are the circadian 
rhythms present in virtually all kingdoms of life. Although the budding yeast 
Saccharomyces cerevisiae does not seem to display any bona fide circadian 
behavior, yeast cells have long been known to be capable of exhibiting 
other modes of oscillatory behavior, typically with a period much shorter 
than 24 h (Richard, 2003). 

Among the first observed of such oscillations were reduced pyridine 
nucleotide oscillations that occurred with a period of ~1 min both in 
cell-free extracts and intact cells (Chance et al., 1964a,b; Ghosh and Chance, 
1964; Hommes, 1964). Subsequently, oscillations in oxygen consumption 
and other physical parameters were observed during continuous culture of 
yeast cells using a chemostat (Parulekar et al., 1986; Porro et al., 1988; 
Satroutdinov et al., 1992; von Meyenburg, 1969). These sustained oscillations 
displayed periods ranging anywhere from ~40 min to over 10 h, 
depending on the strain as well as the cultivation protocol. 

Long-period oscillations of oxygen consumption on the order of several 
hours were first reported by several groups during continuous, glucose- 
limited growth (Kuenzi and Fiechter, 1969; Parulekar et al., 1986; Porro 
et al., 1988; von Meyenburg, 1969). Later on, short-period ~40-min 
oscillations during continuous culture growth were observed under higher 
glucose concentrations (Satroutdinov et al., 1992). Collectively, these studies 
demonstrated that under steady-state, nutrient-poor growth conditions, 
synchronous changes in a variety of metabolic parameters could be observed 
in yeast cells that repeat over time (Kuenzi and Fiechter, 1969; Porro et al., 
1988; Satroutdinov et al., 1992; von Meyenburg, 1969). Consequently, the 
use of chemostats to create controlled growth environments has revealed 
interesting aspects of the behavior of yeast cells that would otherwise be 
difficult to observe using traditional batch culturing methods. In essence, 
these ultradian cycles illustrate how yeast cells might cope with a demanding 
growth environment that is not replete with glucose and other nutrients. 




2. Induction of Ultradian Cycles of Oxygen 
Consumption Using a Chemostat 

Several prototrophic strains have been observed to exhibit ultradian 
cycles of oxygen consumption during continuous growth in a chemostat 
(Parulekar et al., 1986; Porro et al., 1988; Satroutdinov et al., 1992; Tu et al., 
2005; von Meyenburg, 1969). Our preferred strain to study such cycles is 
the CEN.PK prototroph (van Dijken et al., 2000). The CEN.PK strain 
grows more rapidly than the common S288C laboratory strain and is 
genetically tractable (Fig. 35.1). The use of a chemostat to achieve steady- 
state growth conditions is critical for the observation of the robust oscilla- 
tions in oxygen consumption that are a hallmark of these cycles. The 
growth media and general procedure to induce long-period (~4-5 h) cycles 
using a chemostat are detailed below (Table 35.1). In order to observe 
short-period (~40 min) oscillations as described by Kuriyama, Klevecz, Murray, 
and colleagues, the same general procedure is used except that the glucose 
concentration is 20 g/l (2%) and use of the polyploid strain IFO0233 is 
recommended (Klevecz et al., 2004; Satroutdinov et al., 1992). 

Figure 35.1 Comparison of colony size between CEN.PK and S288C. Note that the 
prototrophic CEN.PK cycling strain displays more robust growth than the common 
laboratory strain S288C. Cells were grown on YEPD plates at 30 °C for 48 h. 

Table 35.1 Chemostat media recipe 

Component        Concentration 
(NH4)2SO4        5 g/l 
KH2PO4           2 g/l 
MgSO4            0.5 g/l 
CaCl2            0.1 g/l 
FeSO4            0.02 g/l 
ZnSO4            0.01 g/l 
CuSO4            0.005 g/l 
MnCl2            0.001 g/l 
Yeast extract    1 g/l 
Glucose          10 g/l 
70% H2SO4        0.5 ml/l 
Antifoam         0.5 ml/l 
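
When preparing a feed reservoir, the per-liter amounts in Table 35.1 need to be scaled to the batch volume. The following is a small sketch of that arithmetic, with the glucose concentration left adjustable so that the 20 g/l short-period variant can be computed in the same way. The function name and the 10 l batch volume are assumptions made for illustration.

```python
# Per-liter amounts transcribed from Table 35.1: (amount, unit).
RECIPE_PER_LITER = {
    "(NH4)2SO4": (5.0, "g"),
    "KH2PO4": (2.0, "g"),
    "MgSO4": (0.5, "g"),
    "CaCl2": (0.1, "g"),
    "FeSO4": (0.02, "g"),
    "ZnSO4": (0.01, "g"),
    "CuSO4": (0.005, "g"),
    "MnCl2": (0.001, "g"),
    "Yeast extract": (1.0, "g"),
    "Glucose": (10.0, "g"),
    "70% H2SO4": (0.5, "ml"),
    "Antifoam": (0.5, "ml"),
}


def scale_recipe(volume_l, glucose_g_per_l=10.0):
    """Return {component: (amount, unit)} for a batch of volume_l liters."""
    if volume_l <= 0:
        raise ValueError("volume_l must be positive")
    scaled = {}
    for component, (amount, unit) in RECIPE_PER_LITER.items():
        if component == "Glucose":
            # Allow the glucose level to be overridden (e.g., 20 g/l).
            amount = glucose_g_per_l
        scaled[component] = (amount * volume_l, unit)
    return scaled


# Hypothetical 10 l batches for the long- and short-period protocols.
long_period_batch = scale_recipe(10.0)
short_period_batch = scale_recipe(10.0, glucose_g_per_l=20.0)
```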



2.1. Chemostat setup 

Following sterilization, the chemostat is set to maintain a constant tempera- 
ture of 30 °C and a pH of 3.4. The chemostat vessel is aerated with house air 
at ~1 l/min and agitated at ~450 rpm to maintain sufficient aeration for the 
cell population. The exact agitation and aeration settings will be dependent 
on the chemostat setup; the key is to provide sufficient aeration so that 
molecular oxygen never becomes limiting in the growth environment, 
which can be an issue at high cell densities. The pH is kept constant at 
3.4 by regulated dosing of 1 M sodium hydroxide. Cycles can be observed 
at a range of pH values (3-5.5); the acidic conditions help minimize 
contamination of the growth media. 

Once the pH and temperature have stabilized, the chemostat vessel is 
seeded with an overnight starter culture and the cells are allowed to grow up 
to density in batch mode (OD600 ~10-12). A return to ~100% dO2 levels 
signifies that the cells have exhausted available carbon sources in the media 
(Fig. 35.2). After starvation for a period of at least 6 h, the cell population is 
continuously fed the same media at a dilution rate of ~0.09 h⁻¹, which 
leads to the production of long-period, ~4-5 h cycles. Shortly after the start 
of continuous mode, the cell population becomes highly synchronized and 
exhibits robust oscillations in oxygen consumption (Fig. 35.2). The cycles 
will persist as long as media is continuously supplied to the cells. We have 
observed up to ~100 consecutive metabolic cycles over the course of 
3 weeks without significant loss of synchrony. 
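
For orientation, the dilution rate D is the medium flow rate divided by the culture working volume. In a chemostat the population-average specific growth rate, taken over time, equals D, so the average doubling time is ln 2 / D. The sketch below works through this arithmetic for the values quoted above. The 1 l working volume is an assumption made only for illustration, since the vessel volume is not stated here.

```python
import math


def feed_rate_l_per_h(dilution_rate_per_h, working_volume_l):
    """Medium flow rate needed to achieve a given dilution rate."""
    return dilution_rate_per_h * working_volume_l


def doubling_time_h(dilution_rate_per_h):
    """Average population doubling time when growth rate equals D."""
    return math.log(2) / dilution_rate_per_h


dilution_rate = 0.09      # per hour, as quoted in the text
working_volume = 1.0      # liters; hypothetical value

feed_rate = feed_rate_l_per_h(dilution_rate, working_volume)  # l/h
doubling_time = doubling_time_h(dilution_rate)                # hours

# Mean cycle length implied by ~100 cycles over 3 weeks.
mean_cycle_h = (3 * 7 * 24) / 100
```

With these numbers the average doubling time is roughly 7.7 h, and 100 cycles in 3 weeks corresponds to about 5 h per cycle, in line with the ~4-5 h period quoted above.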



2.2. Comments 

The growth media contains nitrogen, phosphate and sulfur sources, essential 
metals (Ca2+, Mg2+, Fe2+, Zn2+, Cu2+, Mn2+), and glucose as the carbon 
source and limiting nutrient. A small amount of yeast extract (0.1%) supplies 
remaining trace vitamins and cofactors. Yeast nitrogen base without amino 
acids and ammonium sulfate can substitute for the yeast extract. The 
sulfuric acid acidifies the media, which minimizes contamination. 

Figure 35.2 Long-period metabolic cycles during continuous, glucose-limited 
growth. During batch mode, the cells are grown to a high density and then starved 
for a short period. During continuous mode, media containing glucose is introduced to 
the chemostat culture at a constant dilution rate (~0.09 h⁻¹). These metabolic cycles 
comprise three phases: Ox, RB, and RC. 

Long-period cycles have also been observed to occur at 25 and 37 °C; 
the period length is approximately the same at different temperatures, 
although the shape of the oscillations in oxygen consumption is different 
(Tu, unpublished data). In this respect, the cycles exhibit temperature 
compensation. Higher dilution rates lead to shorter cycles and lower dilu- 
tion rates lead to longer cycles. Both haploid and diploid strains can cycle, 
although under the same culturing conditions diploid cycles are ~25% 
longer (Tu, unpublished data). Thus far, we have not been able to observe 
robust, long-period cycles using prototrophic derivatives of the common 
laboratory strains S288C and W303. 




3. Long-Period Cycles 

Long-period cycles have a period ranging anywhere from 2 h to 
upward of 10 h depending on the chemostat dilution rate. These cycles 
are characterized by phases of rapid oxygen consumption that alternate 
with phases of slower oxygen consumption. Interestingly, a variety of 
additional parameters such as budding index, storage carbohydrate con- 
tent, ethanol levels, and carbon dioxide production have been observed 
to oscillate as a function of these cycles, although not necessarily in phase 
with the dissolved oxygen oscillation (Kuenzi and Fiechter, 1969; 
Porro et al., 1988; Tu et al., 2005, 2007; Wang et al., 2000; Xu and 
Tsurugi, 2006). 
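
The period of such a cycle can be estimated from a logged dissolved oxygen trace. One simple approach, shown here as a sketch and not as the method used in the cited studies, is to locate the first major peak of the autocorrelation of the trace. The function name, the 0.1 h sampling interval, and the synthetic sinusoidal trace are assumptions made for illustration; real traces are not sinusoidal and may need detrending first.

```python
import math


def estimate_period(samples, sample_interval_h):
    """Estimate the dominant period (hours) of an evenly sampled trace.

    Uses the lag of the largest autocorrelation value found after the
    autocorrelation first drops below zero.
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]
    variance_sum = sum(c * c for c in centered)
    if variance_sum == 0:
        raise ValueError("trace is constant")

    def autocorrelation(lag):
        products = (centered[i] * centered[i + lag] for i in range(n - lag))
        return sum(products) / variance_sum

    max_lag = n // 2
    acf = [autocorrelation(lag) for lag in range(max_lag + 1)]

    # Skip the initial decay from lag 0 by starting after the first
    # lag at which the autocorrelation becomes negative.
    start = next((lag for lag, value in enumerate(acf) if value < 0), None)
    if start is None:
        raise ValueError("no oscillation detected")
    best_lag = max(range(start, max_lag + 1), key=lambda lag: acf[lag])
    return best_lag * sample_interval_h


# Synthetic dO2 trace (% saturation): 30 h sampled every 0.1 h with a
# 5 h period. Values are invented for illustration only.
interval_h = 0.1
synthetic_trace = [
    50.0 + 30.0 * math.sin(2.0 * math.pi * (i * interval_h) / 5.0)
    for i in range(300)
]
estimated_period_h = estimate_period(synthetic_trace, interval_h)
```

The same estimate applied to other logged parameters, such as carbon dioxide production, would show whether they share the period of the oxygen oscillation even when they are out of phase with it.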

Microarray analysis of gene expression during ~5 h long-period cycles 
of a CEN.PK diploid has revealed the nature of the cellular events that 
occur as a function of such cycles (Tu et al., 2005). Over half of yeast genes 
were found to exhibit robust cyclic expression. Gene products with func- 
tions associated with energy and metabolism and those localized to the 
mitochondria tended to be expressed periodically. For these reasons, these 
long-period cycles of oxygen consumption were given the name "yeast 
metabolic cycle" or "YMC" (Tu et al., 2005). 

Analysis of the long-period expression dataset revealed three superclus- 
ters of gene expression, which were used to define three major phases: 
Ox (oxidative, respiratory), RB (reductive, building), and RC (reductive, 
charging) (Tu et al., 2005). Different categories of genes peak during each 
phase, and cells traverse each of these three phases in every metabolic cycle. 
The Ox phase represents the peak of respiration and is associated with a 
transient induction of ribosomal genes and many genes involved in growth. 
Cell division and the upregulation of genes that encode mitochondrial 
proteins occur during the RB phase, when the rate of oxygen consumption 
begins to decrease. Then in the RC phase, many genes associated with 
starvation and stress-associated responses (e.g., ubiquitin-proteasome, 
vacuole, autophagy, heat shock proteins) are activated prior to the next 
Ox phase (Tu et al., 2005; Brauer et al., 2008; Fig. 35.2). 

The extensive orchestration of gene expression around bursts of respira- 
tion indicates that many essential cellular and metabolic processes such as 
cell division, mitochondria biogenesis, ribosome biogenesis, fatty acid 
oxidation, and autophagy tend to occur during specified temporal windows 
of these cycles (Tu et al., 2005). Temporal compartmentalization of such 
processes might enable cells to execute a variety of processes in a more 
coordinated and efficient fashion and help minimize futile reactions. 

The many oscillating gene expression patterns that occur as a function of 
these cycles predict oscillatory changes in the metabolic state of yeast cells. 
Both liquid chromatography-tandem mass spectrometry (LC-MS/MS) and 
2-D gas chromatography/time-of-flight mass spectrometry (GCxGC-TOFMS) 
metabolite profiling methods were used to monitor the intracellular 
concentrations of ~150 common metabolites at different time 
intervals of the metabolic cycles (Tu et al., 2007). The results of these 
surveys show that many metabolites including amino acids, nucleotides, 
and carbohydrates, oscillate in abundance with a periodicity precisely 
matching that of the cycles. Consequently, many fundamental biological 
processes can be predicted to be intimately coupled to these cyclic changes 
in cellular metabolic state. 




4. Short-Period Cycles 

The majority of studies on short-period (~40-min) ultradian cycles 
have been carried out with the polyploid IFO0233 strain. These cycles are 
also defined by robust oscillations in oxygen consumption. However, these 
oscillations occur on a more rapid timescale, perhaps due to higher cell 
densities that result from higher glucose concentrations in the media. Low- 
amplitude oscillations of transcription as well as metabolite levels were 
reported to occur during these 40-min cycles (Klevecz et al., 2004; 
Murray et al., 2007). Furthermore, gating of DNA replication was observed 
in these short-period cycles, despite the cell division time exceeding the 
period length of these cycles (Klevecz et al., 2004). Surprisingly, there 
appears to be minimal phase correlation between the set of short-period 
cyclic transcripts and the set of long-period cyclic transcripts (Tu and 
McKnight, 2006). How cycles of transcription, translation, mRNA degra- 
dation, and protein turnover can occur on such a short timescale remains a 
fascinating but outstanding question. More information on these short- 
period oscillations can be found in the following reviews on the topic 
(Klevecz et al., 2004; Lloyd and Murray, 2005; Murray et al., 2001). 




5. Significance of Ultradian Cycles 

The ability to undergo robust metabolic cycles during continuous 
growth appears to be a feature of prototrophic yeast strains that do not 
require supplementation of amino acids or nucleobases for growth. More- 
over, there are several significant reasons to favor the use of prototrophic 
strains as opposed to more convenient auxotrophic strains for studies of this 
sort. Mutations or deletions in metabolic genes present in auxotrophic strains 
might compromise the output of numerous cellular and metabolic pathways 
and elicit compensatory responses that are not typical of wild strains of yeast. 
For example, an auxotrophic strain that cannot synthesize adenine, uracil, or 
methionine will be absolutely dependent on supplementation of these meta- 
bolites for growth, which will undoubtedly alter flux through key metabolic 
pathways and hence the regulation of particular cellular processes (Boer et al., 
2008). Moreover, under nutrient-poor conditions, these supplemented 
amino acids and nucleobases might be interconverted to other metabolites 
and become limiting for growth once again. Therefore, it is not surprising 
that auxotrophic strains might not exhibit robust cycles under conditions 
that necessitate glucose being the sole limiting nutrient. 

How do the growth conditions in the chemostat that lead to cycles of 
oxygen consumption compare to more commonly used exponential phase 
growth conditions? Microarray analysis of gene expression during long- 
period cycles has shown that many genes that encode proteins with starva- 
tion and stress-associated functions that are normally not expressed during 
log phase are actually induced during particular temporal windows 
(Tu et al., 2005). Moreover, periodic bursts of mitochondrial respiration 
and peroxisomal β-oxidation are a hallmark of such cycles, whereas in 
exponential phase fermentation is the predominant mode of energy production. 
Indeed, many mitochondrial and peroxisomal proteins are not 
present in high quantities during log phase growth (Ghaemmaghami et al., 
2003). Therefore, it is clear that during cycling yeast cells utilize a more 
extensive assortment of metabolic and regulatory strategies than during 
log phase (Tu and McKnight, 2006; Tu et al., 2005). Thus, the chemostat 
conditions that bring about such cycles can be likened to a glucose-limited 
and nutrient-poor growth environment. 

While the use of nutrient-rich, log phase growth conditions is conve- 
nient and often informative, they might not be typical of conditions in the 
wild, which are likely to be nutrient-poor. Moreover, yeast typically grow 
in colonies, where the cell density is very high and the distance between cell 
neighbors is small. Thus, the dense population of cells in the chemostat that 
undergo ultradian metabolic cycles to a first approximation might resemble 
a colony exposed to a nutrient-poor environment. 

The high degree of synchrony exhibited by yeast cell populations undergoing 
ultradian cycles is self-evident. Following starvation, cells usually 
rapidly self-synchronize upon initiation of continuous mode. The precise 
mechanisms that lead to the establishment and maintenance of synchrony 
are not entirely clear; however, some secreted metabolites have been 
implicated in the process (Murray et al., 2003; Porro et al., 1988). Regardless, 
this metabolically achieved synchrony is remarkably stable, having 
lasted up to 100 cycles (~20 days) (Tu, unpublished data). Since cells 
undergoing long-period metabolic cycles are also highly synchronized 
with respect to the cell cycle, the microarray expression data have enabled 
the high resolution timing of cell-cycle regulated gene expression to a 
previously unachievable resolution of ~2-3 min (Rowicka et al., 2007). 
Thus, these cycling systems might facilitate the study of almost any tempo- 
rally regulated process or pathway. 

Perhaps the most fascinating aspect of these ultradian metabolic cycles is 
the manner by which so many metabolic outputs are precisely orchestrated 
about bursts of mitochondrial respiration. One particularly striking example 
is the observation that DNA replication and cell division are precisely gated 
to temporal windows when oxygen consumption decreases (Chen et al., 
2007; Klevecz et al., 2004; Kuenzi and Fiechter, 1969; Porro et al., 1988; Tu 
et al., 2005; von Meyenburg, 1969). The gating of cell division is reminiscent 
of the circadian gating of cell division previously observed in cyanobacteria, 
mouse liver, and cultured fibroblasts (Matsuo et al., 2003; Mori 
et al., 1996; Nagoshi et al., 2004). Moreover, yeast cells secrete ethanol, a 
product of glycolytic metabolism, and become much less dependent on 
mitochondrial respiration as they enter the cell cycle, which is reminiscent 
of cancer cell division and the Warburg effect (Tu et al., 2005, 2007; 
Warburg, 1956). How cell division as well as other fundamental cellular 
processes are intimately coordinated with the metabolic state of a cell 
remains an important open question. 

In conclusion, a chemostat enables the maintenance of constant pH, 
temperature, aeration, and nutrient levels thereby creating steady-state 
growth conditions that are not easily achievable in batch cultures. In turn, 
it becomes possible to observe the behavior of yeast cell populations with 
minimal interference from external variables. Based on the precise, coordi- 
nated expression of many groups of genes in a manner that makes biological 
sense, these ultradian metabolic cycles offer a view of the diverse metabolic 
and regulatory strategies undertaken by a yeast cell under glucose-limited, 
nutrient-poor conditions. It is hoped that the study of these prototrophic 
yeast strains undergoing synchronized metabolic oscillations will contribute 
toward our understanding of biological cycles as well as complex metabolic 
diseases such as cancer and aging. Leo Szilard, one of the inventors of the 
chemostat, fittingly predicted almost 60 years ago: "A study of this 
slow-growth phase by means of the chemostat promises to yield information 
of some value on metabolism, regulatory processes, adaptations, and mutations 
of microorganisms" (Novick and Szilard, 1950). 
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amplification, PCR, 91 

ChIP samples, 86 

illumina, 87 

PCR purification, 88 
DNA ligation kit, 116 
DNA microarrays, gene function analysis 

array scanning, 14 

cell growth, 9 

double-mutant 

fundamental components, 6-7 
genome- wide interaction map, 6 

dye labeling, 11-12 

experimental design, 8-9 

gridding, 14-15 

hybridization, 12—13 

microarray washing, 13 

normalization, 15 

poly-A RNA purification, 10-11 

RNA isolation and purification, 9-10 

single-mutant 

gene deletion/mutation, 5 
mRNA levels, 4 
secondary effects, 4-5 
Drug dosage determination 

materials, 240 

prescreen compounds, 238-239 



ECVPs. See Extended copy-number variation 

profiles 
Electron multiplying CCD (EMCCD) 
description, 591-592 
manufacturers, 592 
Electron transfer dissociation (ETD) 
vs. CID, 264 

fragmentation process, 327 
phosphopeptides identification, 327-328 
Electrospray ionization (ESI) 

LC output and gas-phase ions, 407 
nonvolatile molecules, 263 
physical mechanisms, 410 
pitfall, 411 
Endoplasmic reticulum-asso dated degradation 

(ERAD) 
Cdc48 complex, 665 
integral membrane proteins, 673—677 
microsomes, 666 
in vitro assays 

ATP regenerating system, 668-669 

degradation, 669—671 

AGppaF isolation, 667 

microsome preparation, 666 

paF retro translocation assay, 671-673 

yeast cytosol, 667-668 
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Epistatic mini-array profile (E-MAP) approach 
gene selection, 210—211 
strategies, 207-208 
ESI. See Electrospray ionization 
Experimental evolution, yeast 
analysis techniques 

fitness, 501-502 

population genetics, 501 

sampling regimen, 500 
chemostats 

assembly, 503 

definition, 491 

equipment, 491-492 

preparation, 502 
data, 504 
design 

duration, 495-496 

growth conditions, 493-494 

population size, 494-495 
growth rate, 498 
hygiene, strain, 499-500 
inoculation, 503 
media, 497-498 
medium formulation, 502 
miniaturization, 493 
rationales 

facile technique and synthetic biology, 490 

genome structure and organization, 489^-90 

sex and ploidy, 488-489 
record-keeping, 500 
sampling 

daily, 503 

weekly, 503-504 
serial dilution 

advantages, 490-491 

description, 490 

nutrient exhaustion, 491 
sterile practices, 499 
strains and markers 

flocculant growth, 497 

metabolic selection, 496 

mutation, 496-497 
turbidostat 

commercial options, 492-493 

growth rate, 493 
Extended copy-number variation profiles (ECVPs) 
interaction network 

biochemical / genetic, 472—474 

phylogenetic history, 474-475 

proximity relation, 472 
orthogroup classes 

centroid, 471-172 

functional constraints, 172 



Fluorescence in situ hybridization (FISH), 630, 631 
Fluorescent proteins (FPs) 

detectability, 595-597 

homologous recombination, 594-595 



photostability, 597 
Fourier-tranform mass spectrometers, 260 
Freeze-substitution (FS) 
immunolabeling, 610 
infiltration problems, 608—609 
media, 606 
principle, 605—606 
temperature variations, 608 
FS. See Freeze-substitution 
Fungal genome sequencing 
evolution mechanisms 

ortholog, 847 

RIP, 846 

size, genome, 843-844 

subtelomeres, 846-847 

WGD, 844-845 
gene family conservation, 848-849 
impact, 849-851 
kingdom, fungal 

assemblies, 835-838 

classification, 840-841 

definition, 834 

evolution, 834, 840 

WGD, 839-840 
multiple strains, 852 
noncoding elements, computational prediction 

comparative analysis, 842 

transcriptional data analysis, 841, 843 
pontential, sex, 847-848 
Fusion PCR. See Fusion polymerase chain 

reaction 
Fusion polymerase chain reaction (PCR), 804-808 



Gas chromatography (GC), 371 
Gateway cloning system, 354, 356 
Gaussia luciferase (Glue), 362-365 
GC— MS analysis, sterol 
composition, 387 
SPE separation, 387, 389 

wild-type (WT) and erg mutant strain, 388-389 
Gene deletion marker switch method, 149—150 
Gene function and drug action exploration 
analysis, 253-254 
array normalization, 248 
assay results, HIP, HOP and MSP, 236-237 
barcode sequences and array options, 253 
drug dosage determination 
materials, 240 

prescreen compounds, 238-239 
experimental pool growth 

deletion profiling (HIP/HOP), 240-241 
materials, 241 
hybridization 
materials, 245 
process, 244-245 
log2 ratio calculation, 249 
microarray data confirmation 
strains deletion, 250-252 
96-well plate format, 250 
MSP 

pool construction, 238 
ratios calculation, 249 
outlier masking, HIP and HOP, 246 
pooled screening analysis, 234-235 
purification and amplification, barcodes and 
ORFs 
deletion profiling (HIP/HOP), 241-243 
MSP, 243 
replicates and OD values, 252 
saturation correction, HIP, HOP and MSP, 

246-248 
unusable tags removal, 248-249 
yeast deletion and strain pool construction, 
237-238 
Gene history reconstruction, ascomycota fungi 
biological analysis 

copy-number variation profiles, 471-477
gene sets and orthogroup projection, 468-471
orthogroup categories, 466-468 
singletons and ORF predictions, 468 
orthogroup quality evaluation 

curated resource comparison, 463-465 
orthologs and reconstruct gene trees, 460-461 
robustness, 461-463 
simulated, 465-466 
WGD, 459-460 
orthologous and paralogous identification, 

448-449 
paralogous analysis 

conserved interactions, divergence, 479-480 
functional divergence, 478-479 
phylogenetic gene tree, 449-450 
synergy 

gene similarity scoring, 453-454 
graph, gene similarity, 454 
identification, orthogroups, 454-459 
orthogroups definition, 451-453 
Gene ontology (GO), 5, 24 
Genetic interaction mapping, E-MAP approach 
biological hypotheses extraction 

dissecting multiple role, single gene, 

228-229 
enzyme-substrate relationship prediction, 

225, 227 
genes acting identification, 224-225, 226 
opposing enzyme relationship prediction, 
227 
data processing and computation, scores 
expected colony size, 223 
genetic interaction scores, 223-224 
preprocessing and normalization, 221-222
quality control, 224 
double mutant strains 

average colony size measurements, 
213-214, 215 



digital photography, 221 

PEM protocol, 219-220 

protocols comparison flowchart, 211-212

SGA protocol, 214-218

Singer RoToR pinning station, 214

technical details, 213 
HIR and CAF complexes, 206-207 
mutation selection, 209-211 
phenotype differences, 208-209 
strategies, 207-208 
Genetic interactions 

positive and negative, 157 
systematic effects and normalization 
procedures, 158-159
Genome-wide translational profiling 
data analysis 

gene expression quantification, 136-138 

high-quality alignment selection, 136 

polyadenylated sequence mapping, 134-135 

reference databases, 135-136 
gene expression, measurement, 120 
nucleic acid 

gel extraction, 140 

precipitation, 139-140 
oligonucleotides, 139 
ribosome footprint generation 

cycloheximide addition, 121 

extract preparation, 122-123

fragment purification, 124-126

mRNA preparation, 126-128

nuclease digestion and monosome 
purification, 123-124 

procedure, 121-122
sequencing library preparation 

circularization, 130 

optimization structure, 128-129 

partial alkaline hydrolysis, 128 

PCR amplification, 131-133 

polyadenylation, 129-131 

reverse transcription, 130, 132 
solution, 138-139 
Global transcription machinery engineering 

(gTME), 511 
GLOX solution, 440-441 
GO. See Gene ontology 

H 

Haploinsufficiency profiling (HIP), 234
HCS. See High content screening 
High content screening (HCS) 

cell biological phenotypes, 171-172, 173 

media, 172 

statistical analysis and data mining, 174 
High-pressure freezing (HPF) 

filtration, 614-615 

instrument mobility and correlative 
microscopy, 605 

yeast preparation, 606-611
High-quality binary interactome mapping
biological evaluation, 291-292 
false positive 

biological, 285 

technical, 284-285 
high-throughput strategies, 282-283 
high-throughput Y2H pipeline 

autoactivator removal and AD-Y pooling, 
298-302 

DB-X and AD-Y expression plasmids, 
292-296 

media and plates, 308-310 

screening and phenotyping, 302-305 

verification, 305-307 

yeast transformation, 296-298 
interactome maps, 283 
orthogonal binary interaction assays 

Well-NAPPA, 311-312 

YFP-PCA, 310-311 
Y2H datasets 

autoactivators, 286 

candidate interactions retesting, 286-287

implementation, 287-288

validation, 288-291 
High-resolution mass spectrometry 
analysis 

data-dependent mode, 276 

equipment, 275-276

experiment data, 276-277

sample preparation/injection, 276 
complex protein mixtures, 262 
computational proteomics and data analysis, 

268, 270, 271 
dauntingly complex mixtures, 261 
LTQ-Orbitrap Velos, 270 
MaxQuant and novel bioinformatic tools, 

270, 272 
peptide 

IEF, 275 

and proteins identification and quantitation, 
277 
protein in-solution digest, 274 
proteome-scale experiment, 261-262
quantitative proteomics 

isotope-coded affinity tag assay, 267 

label free approach, 266-267 

in vivo labeling, 268, 269 

metabolic labeling, 267-268 
shotgun approaches 

ESI, 263 

ETD, 264 

ion traps, 265 

LTQ-Orbitrap, 265-266 

MALDI, 263-264 

metabolic labeling, 267-268 

MRM, 264-265 

peptide ions separation, 262-263 

quadrupoles, 264 



SILAC 

culture, yeast, 273 

extract preparation, 273-274

incorporation, 274-275

labelling media, 272-273 

yeast strains, 272 
use, 260 
High-throughput Y2H pipeline 
AD pooling 

AD-Y clones, 300-301 

construction, 300 

yeast cell lysis, 301 

yeast lysate PCR, 301-302 
autoactivator identification and removal, 

298-300 
candidate interaction pair verification, 305-307 
DB-X and AD-Y expression plasmids 

bacterial transformation, 294-295 

destination vectors restriction digestion, 
293-294 

gateway cloning, 292-293 

gateway LR recombinational cloning, 294 

PCR, bacterial culture, 295-296 
nonselective rich yeast medium (YEPD), 308 
phenotyping, 304-305 
primary screening, 302-304
yeast media 

amino acid powder mix and stock solutions, 
309-311 

synthetic complete (Sc), 309 
yeast transformation, 296-298 
HILIC. See Hydrophilic interaction 

chromatography 
Hit-clustering methods, 448-449
Homozygous profiling (HOP), 234 
HPF. See High-pressure freezing 
Human pathogen Cryptococcus neoformans
culture, techniques 

dominant drug selection markers, 803-804 

YPAD, 802-803 
haploid yeast, 798-799 
life cycle, 800-802 
molecular biology techniques 

biolistic transformation, 808-812

colony PCR, 812-813

genomic DNA extraction, 814-815

protein extraction for SDS-PAGE, 817

RNA extraction, 815-816 

targeted gene deletion and conditions, 
fusion PCR, 804-808 
murine infection model, assaying pathogenesis 

evaluations, 821 

genomic DNA, 823 

inoculum preparation and intravenous
infection, 820 

intranasal infection, 818-820 

melanization, 826-828

monitoring disease progression, 820-821 

STM score calculation, 822, 824 

STM screening, 821, 824

strain and age, 818 

tissue culture, macrophages, 825-826 

virulence factors, 825, 828 
serotypes, strains, and sequences, 799-800 
Hydrophilic interaction chromatography 

(HILIC), 409-410 



Immobilized metal affinity chromatography 

(IMAC), 323-324, 701-702 
Integral membrane proteins (IMP) 
cell growth, 699 
characterization, 705 
expression and purification steps, 696 
membrane preparation and solubilization 

cell lysing, 699-700 

solubilizing detergents, 700-701
molecular biology 

LIC, 697-698 

p423-GAL1 expression plasmid, 698

vectors, 697 
purification 

IMAC, 701-702 

ion-exchange chromatography, 702-703

PDLC, 704 

SEC, 702-704 
Integrated Genome Browser (IGB), 96 
In vitro assays, ERAD 

cytoplasmic aggresomes, 664-665 
degradation, pαF

cytosol and ATP-dependent degradation,
reconstitution, 669-671

ppαF and ΔGppαF, translocation, 669
integral membrane proteins 

CFTR, 673 

retrotranslocation, 676-677

in vitro ubiquitination assay, 673-676 
microsomes, 664 

pαF retrotranslocation assay, 671-673
pαF soluble substrate

ATP regenerating system, 668-669

ΔGppαF isolation, 667

microsome preparation, 666 

yeast cytosol, 667-668 
pre-pro-alpha factor (ppαF), 663
protease, 662-663
translocation, 662 
In vitro ubiquitination assay 

ubiquitination reaction, 674-676
yeast cytosol and I-labeled ubiquitin, 673-674
yeast microsomes isolation, 673-674
Ion suppression, 407 
Isotope-coded affinity tag assay, 267 
iTRAQ labeling, 267 



Jackknife-based approach, 461 



Large-scale genetic interaction mapping, 147 
LC-MS. See Liquid chromatography-mass 

spectrometry 
Ligation independent cloning (LIC), 697-698 
Limma software package, 42 
Lipid analysis 

ESI-MS and ESI-MS/MS 

characterization, tandem mass 

spectrometry, 384 
MRM, 385 
samples, 377-379 
single stage profiling, 379-384 
targeted profiling and semiquantitative, 
384-385 
Liquid chromatography (LC) 
cationic nature, 409 
HILIC method, 409-410 
metabolite extract, 407-408 
polar analytes retention, 408-409
reversed-phase, 408 
Liquid chromatography-mass spectrometry 
(LC-MS) 
benefits and response factor, 395 
core components, 407 
interface, 407 
labeling kinetics, 397 
metabolites 

absolute quantitation, 396-397 
changes, 395-396 
unanticipated, 396 
steady state isotope, 397-398 
Locally weighted scatterplot smoothing 

(LOWESS), 16 
LOWESS normalization algorithm, 70 
LTQ-Orbitrap Velos, 270 
Markov Cluster Algorithm (MCL), 40
Mass spectrometry (MS) 
analysis 

data-dependent mode, 276 
equipment, 275-276
experiment data, 276-277
sample preparation/injection, 276 
high resolution mass analyzers, 413-415 
hybrid, 265-266 
instruments, hybrid, 415 
protein in-solution digest, 274 
triple quadrupole, 411-413
MasterMap, 330 
MATa strains, 342 
MATLAB toolbox software

colony area size measurement, 221-223
incorrect strain identification, 224 
Matrix assisted laser desorption ionization 

(MALDI), 263-264, 371 
MaxQuant software, 277 
MCL. See Markov Cluster Algorithm 
Mean-squared displacement (MSD) analysis 
curve, 556 
distance, 557 
free diffusion, 554-556 
Metabolic labeling 
complication, 268 
lipid analysis, 371 
SILAC pairs, 267-268 
trypsin, 267 
Metabolites. See also Liquid 

chromatography-mass spectrometry 
chemical derivatization 

amino acid, 405-406 

thiol and disulfide, 406 
extraction 

cells on filters, 404 

harvesting, 402-403 

methanol quenching, cells, 404-405 

two cell harvesting methods, 403 
Microarray data analysis and visualization 
cluster analysis 

hierarchical, 38-40 

network-based approaches, 40-41
data management 

persistence and integrity, 44 

replication and multiuser support, 45 

web-based access, 45-46 
designing goals, 23 
differential expression 

Limma software package, 42 

significance analysis, 41-42 
gene sets 

GO term mapping, 42-43

motif searching, 43 

network visualization, 43 
hybridization 

competitive, 24-25 

selection, 25-26

single-sample, 24 
image analysis 

data files, 27-28 

digital, 27 

steps, 26 
information flow, 20-21
MIAME compliance, 46-48 
preprocessing 

bioconductor, 30-32 

goals, 29 

housekeeping genes normalization, 35 

intensity bias, 35 

median normalization, 34-35



normalization, ratio values, 33 

ratio value calculation, 32-33 

software tools, 29-30 

spreadsheet applications, 30 

statistics packages, 30 
quality assessment 

array, 36-37 

spot, 36 

utilization, array and spot, 37 
software tools 

commercial packages, 28 

open sources, 28-29 
terms and definitions, 22-23 
Microarray data confirmation 
strains deletion, 250-252 
96-well plate format, 250 
Microarrays, mRNA splicing 
analysis, data 

data normalization and 
replication, 70 

intron-containing genes, 71 

methodologies, 72-73 
data collection 

Agilent, 68 

Cy3 and Cy5, 69 
design 

Agilent platform, 55 

exon, 54-55 

intron-containing genes, 54 

junction probe, 55 

oligonucleotide, 53 
hybridization 

materials, 66 

protocol, 66-67 
washing 

materials, 67 

protocol, 67-68
Microarray washing, 13 
Miniaturization, 493 
mRNA 

definition, 442 
hybridizing probes 

antibleach solution and enzyme preparation, 
440-441 

preparation and wash buffers, 439 

sample imaging, 441-442

yeast cells, solution, 440 
in situ hybridization methods, 431-432 
spot detection, 442-443 
STL1 detection, NaCl shock, 444-445 
mRNA expression analysis, FISH 
acquisition, image 

microscope, 654 

sensitive, 653 
image analysis 

low-intensity signals, 656 

maximum projection, 654 

nascent transcripts, 657 

signal intensities, quantification, 655 

spot detection, 656-657 
labeling efficiency 

extinction coefficients, 647 

molecular weight, oligo, 646 
life cycle, steps, 642 
materials 

cell fixation, preparation and storage, 648 

hybridization, 650-651 

probe labeling, 645 
probe design 

hybridization conditions, 643 

oligos, 645 

sequence homology, 644 
protocol 

cell fixation, preparation and storage, 
648-649 

hybridization, 652-653 

probe labeling, 646 
MS. See Mass spectrometry 
MS based metabolomics, yeast 
culture conditions 

batch liquid and chemostats, 398-400 

filter, 401-402 

methanol quenching, yeast harvesting, 
400-401 

yeast growth, filters atop agarose support, 402 

yeast harvesting, vacuum filtration, 400 
electrospray ionization, 410-411 
experimental design, 395-398 
high-resolution mass analyzers, 413-415 
hybrid instruments, 415 
LC. See Liquid chromatography 
LC-MS. See Liquid chromatography-mass 

spectrometry 
metabolites 

chemical derivatization, 405-406 

extraction, 402-405 
strain, 398 

targeted data analysis, 415-417 
triple quadrupole, 411-413
untargeted data analysis 

adduct, 418-420 

in-source fragmentation, 421 

isotopic variants, 420 
Multicopy suppression profiling (MSP), 236 
Multiple reaction monitoring (MRM)
experiments, 264-265
lipids quantification, 385, 386 
Nonessential query strain

gene deletion marker switch method, 149-150
PCR-mediated gene deletion, 149 

Nucleosomes in yeast, genome-wide mapping 
deep sequencing 

DNA ligation kit, 116 



freeze-n-squeeze, 117 

Solexa/Illumina, 115-116
mononucleosomal DNA, isolation, 107-111
protein-binding sites, 105-106 
tiling microarray analysis 

labeling and hybridization, 113 

protocol, 114 
titration level, purification 

chromatin characteristics, 112-113 

steps, 112 
Nyquist-Shannon sampling, 592-593 
Oligonucleotides, 139
OligoWiz, 54
Optical density (OD), 9
Orthogonal binary interaction assays
Well-NAPPA, 311-312
YFP-PCA, 310-311
Orthogroups 
categories 

age, 468 

IFH1 and CRF1, 466, 467
copy number variation profile 

ECVPs, 471-472 

interaction networks, ECVPs, 472-475

volatility, 475-477 
curated resource comparison, synergy 

paralogy assignments, 463-464 

prediction, 463-464 

synteny, 465 

YGOB's annotations, 464-465 
definition, 452-453 
functional dichotomy, 476 
fungal robustness 

duplication and loss events, 463 

jackknife-based approach, 461 

nonsingleton, 462-463

sound and complete, 462 
identification, synergy 

algorithm, 454-455 

candidate, breaking, 458-459 

gene similarity graph, 459 

matching candidate, 456 

phylogenetic tree reconstruction, 456-457 

tree rooting, 457-458 
projection and gene sets 

Fisher's exact test, 470 

history and functions, 468-470 

interaction network, 471 

S. cerevisiae genes, 470-471
quality evaluation, Ascomycota 

identification and reconstruct gene trees, 
460-461 

WGD, 459-460 
simulated 

parameterization, 465-466 

synergy's accuracy, 465 
PAM. See Prediction Analysis for Microarrays
Paralogous genes 

conserved interactions, divergence 

annotation, 479 

computation statistics, 479-480 

pair partition, 480 
functional divergence estimation 

gene set annotations, 478 

shared regulatory mechanisms, 478-479 
P-bodies and stress granules, Saccharomyces cerevisiae 
growth and induction conditions, 621 
mammalian cells, 620 
markers 

core, protein, 622 

fluorescent fusion protein, 626 

foci distinct, 624-625 

protein component, 623-625

subcellular distribution, 625-626 
messenger RNA monitoring 

binding sites, 631 

FISH technique, 630-631 

plasmids, 632 
mutation/perturbation, size, and number 

alterations, 633-634 

effects dissection, 634 

observation conditions, 631, 633 
quantification, size, and number 

cytoplasmic protein, 635 

manual, 637 

semiautomated, 636-637 
sample preparation 

core P-body, 627 

immunofluorescence methods, 628 

mid-log glucose-deprived cells, 628-629 

optimal yeast fluorescence microscopy, 
629-630 

subcellular distribution, 626 
translational repression, 622 
PBS. See Phosphate-buffered saline 
PCAs. See Protein-fragment complementation assays
PCR-mediated gene deletion, 149 
Peptide sequence similarity score 
description, 453 

vs. synteny similarity score, 453-454 
Phenotypes, yeast 
diversity 

calculation, 522-524 

cellular property, 520 

mutant libraries evaluation, 521-522 

transformation efficiency, 520-521

unique cellular phenotypes, 519-520 
selection strategies 

library creation and maintenance, 525 

liquid vs. solid media, 525-526 

and post selection screening, 526-528 
Phosphate-buffered saline (PBS), 821 



Phosphopeptide isolation 
IMAC, 323-324 
mass spectrometric analyses

CID MS2 spectra, 327 

LC system, 326 

LTQ-Orbitrap, 326-327 
phosphoramidate chemistry 

amino-derivatized dendrimer, 325-326 

glass beads, 324-325 
TiO2 resin, 323
Phylogenetic gene tree, 449 
Pin tool sterilization procedures 
BioMatrix robot, 151-152 
manual, 150-151 

Singer RoToR bench top robot, 151
Plasmid library construction 
PCR, mutagenesis 

DNA concentration, 516 

GeneMorph II, 515 

megaprimers concentration, 517

nicked plasmid and DNA, 517-518 

plasmid, 515-516 

purification, megaprimers, 516-517 
promoter selection 

biomass formation, 514-515 

coding sequence insertion, 513-514 

constitutive expression, 513 

mutagenized transcription factor, 
512-513 

transcription factor copy, 514 
sequence diversity and library maintenance 

bacterial colonies, 518 

cell culture, 519 

statistical model, 518-519 
PMB. See Pombe minimal glutamate 
Polyadenylation, 129-131 
Polymerase chain reaction (PCR) 
amplification, 188 
mutagenesis, 198 

products combination and concentration, 190 
Pombe epistatic mapper (PEM) protocol 
drug concentrations, 219 
genotypes, 219 
growth media, 219 
plates nomenclature, 219 
procedure 

GC1-GC2, 220 

GC2-GNC, 220 

library arrays preparation, 219-220 

mating, 220 

query arrays preparation, 219 

SPAS-GC1, 220
Pombe minimal glutamate (PMB), 761 
Pooled fitness assay methodology 
array normalization, 248 
drug dosage determination, 238-240
experimental pool growth, 240-241
hybridization, 244-245 
log2 ratio calculation, 249 
microarray data confirmation, 250-252 
MSP pool construction, 238 
MSP ratios calculation, 249 
outlier masking, HIP and HOP, 246 
purification and amplification, barcodes and 

ORFs, 241-243 
saturation correction, HIP, HOP, and MSP, 

246-248 
unusable tags removal, 248-249 
yeast deletion strain pool construction, 
237-238 
PPIs. See Protein-protein interactions 
Prediction Analysis for Microarrays (PAM), 40 
Pre-mRNA splicing 
data, microarray 

analysis, 69-73 

collection, 68-69 
genome-wide changes, 73 
microarray design 

Agilent platform, 55 

exon, 54-55 

intron-containing genes, 54 

junction probe, 55 

oligonucleotides, 53 
sample preparation 

cDNA synthesis, 60-64 

cell collection, 56-57

fluorescent labeling, cDNA, 64-66

microarray hybridization, 66-67

RNA isolation, 57-60 

washing, microarray, 67-68 
Promoter/terminator clone construction 
PCR amplification, 188 
structure, 186-187 

T4 DNA polymerase resection reaction, 188-189
Protein-detergent-lipid complex (PDLC), 704, 705
Protein-fragment complementation assays (PCAs)
binary/proximal interactions, 337-338
conceptual basis, 336-337 
DHFR survival selection, PPIs 

experimental procedure, 342-344 

high-throughput screening, 339-340 

materials, 341 

oligonucleotide cassettes, 340 

prey and bait strains, 341-342

principle, 339 

screen analysis, large-scale, 345-348 
GFP family fluorescent protein, PPIs 

cell preparation, 361 

competent yeast cotransformation, 359-361 

features, 356 

plasmids, 359 

reagents and equipment, 358 

result, 365 

timeline, 361 

and YFP PCAs, 356-357 



life and death selection, PPIs 

death selection screen, 352-353 

experiment preparation, 351-352

facultative and equipment, 351 

fcy1 deletion strains, 348-350

Gateway cloning system, 354, 356

OyCD, 348

reagents, 350-351 

survival-selection screen, 353-354 

timeline, 354, 355 

two-step OyCD PCA screen preparation, 350 
luciferases reporter, PPIs 

cell preparation, bioluminescence assay, 
364-366 

competent yeast cotransformation, 363-364 

equipment and plasmids, 363 

fragments fusion, chromosomal loci, 364 

reagents, 362 

Rluc and Gluc, 362-365

timeline and results, 366
N- and C-terminal fragments, 339
PPI, 338-339 
sensitivity, 338 
Protein phosphorylation 
data analyses 

database search, 329 

quantification, 329-330 

regulation statistical significance, 330-331 
kinases and phosphatases, 318 
peptide sample generation 

medium, growth conditions and harvest, 
yeast, 321 

phosphopeptide isolation, 320 

purification, 322 

trichloroacetic acid (TCA), 319-320

yeast cell lysis, 321 
phosphopeptide isolation 

IMAC, 323-324 

mass spectrometric analyses, 326-328 

phosphoramidate chemistry, 324-326 

TiO2 resin, 323
phosphoproteome state and MS approach, 
319, 320 
Protein-protein interactions (PPIs). See also 

Protein-fragment complementation assays
DHFR PCA survival selection 

high-throughput screening, 339-340 

materials, 341 

oligonucleotide cassettes, 340 

procedure, 341-348 
PCAs 

GFP family fluorescent protein, 356-365 

life and death selection, dissection, 348-356 

luciferases reporter, 362-366 
Protein transformation protocol, yeast prion 

particles 
conversion efficiency and prion strain 
phenotypes 
Ade+ colonies, 688-689

distinct strain generation, 691 

fibers, 689 

multiple strains generation, 690 
induction, 687 
lyticase preparation, 686 
Sup-NM 

preparation, 684-685 

purification, 683-684 
Sup35p, 682 

URA3 plasmid, 686, 688 
in vivo prions preparation, 685-686 



Qiagen QIAquick gel extraction kit, 88 
Quadrupole time-of-flight (QTOF), 326 

R 

Random spore analysis (RSA), 770 
Raw colony data statistical analysis 
baits and preys, 346-347 
protein-fragment complementation, 346 
steps 

plates growth rate, 347 
true/false positive interactions, 347-348 
Renilla luciferase (Rluc), 362-365 
Repeat induced point (RIP), 846 
Ribosome footprint generation 
cycloheximide addition, 121 
extract preparation 

cell freezing steps, 122-123 
yeast lysis, 123 
fragment purification process 
aqueous phase, 124-125 
filtration, 125 
size selection, 125-126 
mRNA preparation 

deep sequencing measurements, 126-127
fragmentation reaction, 127-128
purification from RNA, 127 
nuclease digestion and monosome purification, 

123-124 
procedure, 121-122
RIP. See Repeat induced point 
RNA FISH 

mRNA hybridization 

antibleach solution and enzymes 

preparation, 440-441 
preparation and wash buffers, 439 
samples imaging, 441-442 
stringency, 438-439 
yeast cells, solution, 440 
mRNA spot detection, 442-443 
oligonucleotide 

designing, 432-433 
fluorophore coupling, 433-435 
purification, HPLC, 435-437 



S. cerevisiae fixing, 437-438
STL1 mRNA detection, 444-445 
RSA. See Random spore analysis 



Schizosaccharomyces pombe, molecular genetics 
biology, growth, and maintenance 

adenine and low-ade media and storage, 766 

conjugation and sporulation, media, 765 

Edinburgh minimal media, 761 

liquid media, 766-767 

media formulations, 762-764

phloxin B, 761, 766 
bulk spore germination, 771 
cell synchrony 

block and release, 775 

G1/G0 arrest, nitrogen starvation, 775-776 

lactose gradient centrifugation, 774-775 

septation index, 773 

S-phase arrest, 774 
colony PCR, 781-783 
crosses, genetic, 768-769 
diploid isolation, 772-773 
DNA and septum staining, cell, 787-788 
fixatives immunofluorescence, cell, 788-791 
flow cytometry 

nuclear ghosts, 786-787 

whole cells, 785-786 
integrations, 778-779 
life cycle, 767 

mating and sporulation, 768 
mating type testing, 769-770
novel mutation isolation, 779 
plasmids, 776-778 
preparation 

DNA, 781 

RNA, 783 
RSA, 770 

soluble protein extract, 783-784 
TCA protein extraction, 784-785 
tetrad dissection, 770-771 
transformation, DNA 

electroporation, 779-780 

lithium acetate, 780 
SDA. See Strand displacement amplification 
SDD-AGE. See Semidenaturing

detergent-agarose gel electrophoresis 
SD systems. See Spinning-disk systems 
Selected reaction monitoring (SRM), 412 
Self-organizing maps (SOMs), 40 
Semidenaturing detergent-agarose gel 

electrophoresis (SDD-AGE), 713, 717, 719, 

721 
Sequencing data management, 93-94 
Sequencing library preparation 
circularization, 130 
optimization structure, 128-129 
partial alkaline hydrolysis, 128 
PCR amplification 

DNA quantification, 133 
gel purification, 132-133 
preparation, PCR mixes, 131-132 
polyadenylation, 129-131 
reverse transcription, 130, 132 
Serial dilution, 490-491 
SGA. See Synthetic genetic array 
SGAM. See SGA mapping 
SGA mapping (SGAM), 175 
Shotgun approaches 
ESI, 263 
ETD, 264 
ion traps, 265 
LTQ-Orbitrap, 265-266 
MALDI, 263-264 
metabolic labeling, 267-268 
MRM, 264-265 
peptide ions separation, 262-263 
quadrupoles, 264 
Signature-tagged mutagenesis (STM), 821-824 
SILAC 

culture, yeast, 273 
extract preparation, 273-274
incorporation, 274-275
labelling media, 272-273 
yeast strains, 272 
Single mRNA molecules imaging 
fluorescent protein, 430-431 
RNA FISH 

fluorophore coupling, oligonucleotide, 

433-435 
mRNA hybridization, 438-442 
mRNA spot detection, 442-443 
oligonucleotide designing, 432-433 
probe purification, HPLC, 435-437 
S. cerevisiae fixing, 437-438
in situ hybridization comparison, 431-432 
STL1 mRNA detection, NaCl shock, 
444-445 
Size-exclusion and ion-exchange 

chromatography (SEC), 701-704
SOMs. See Self-organizing maps 
Spinning-disk confocal microscope, yeast 
base and scanhead, 585 
cameras 

EMCCDs, 591-592 
magnification, 593 
read noise, 591 
sensitivity, 590 
components, 583-584 
lasers and filters, 585-586
multiple pinholes, 583 
objective 

correction collars, 589-590 
oil-immersion, 587 
resolution, 586 
spherical aberration, 587-588 



water-immersion, 590 
optical sectioning, 582 
out-of-focus light, 582-583 
sample preparation 

autofluorescence minimization, 597-598 
fluorescent tagging, 594-597 
mounting, 598-599 
software and system integration, 594 
Z-stack acquisition, 593 
Spinning-disk (SD) systems 

microscope systems, comparison, 547 
rapid time-lapse imaging, 545 
S. pombe SGA (SpSGA)

media and stock solutions, 171 
procedure 

384-formatted array plate preparation, 167 
384-formatted query plate preparation, 

165-167 
query mating and array, 167-168
RoToR preparation, 165 
Sporulation, 192, 200 
SpSGA. See S. pombe SGA 
State-of-the-art proteomics analysis, 261 
Sterol 

analysis, 385-389 
isolation 

cell extraction, 376-377
solid phase extraction, 377 
STL1 mRNA detection, 444-445 
STM. See Signature-tagged mutagenesis 
Strand displacement amplification (SDA), 

754-755 
Synergy 

description, 451 
gene similarity 
graph, 454 
scoring, 453-454 
orthogroups 

curated resource comparison, 463-465 
definitions, 452-453 
fungal robustness, 461-463 
identification, 454-459 
simulated, 465-466 
Synteny similarity score 
description, 453 

vs. peptide sequence similarity score, 453-454 
Synthetic genetic array (SGA) 
analysis, 147-148 
application 

chemical genomics, 175-176
combination and gene overexpression, 

174-175 
gene and genetic interactions, 174 
integration and HCS, 171-174 
SGAM, 175 
cell genetic landscape, 163 
data processing, 156 
1536-density DMA construction, 152 
double mutant array, 155-157
genetic interactions

chemical, 176 

interpretation and analysis, 159-165

properties, 161 

quantitative scoring, 157-159 

Saccharomyces cerevisiae, 146-147
HCA pipeline, 173 
media and stock solutions, 168-170 
methodology, 154 
pin tool sterilization 

BioMatrix robot, 151-152 

manual, 150-151 

Singer RoToR bench top robot, 151
procedure, 153-155 
query strain construction 

gene deletion marker switch method, 149-150

PCR-mediated gene deletion, 149 
SpSGA 

384-formatted array plate preparation, 167 

384-formatted query plate preparation, 
165-167 

media and stock solutions, 171 

query mating, 167-168

RoToR preparation, 165 
systematic effect correction, 160 
Synthetic genetic array (SGA) protocol 
drug concentrations, 216 
genotypes, 214 
growth media, 214-216
plates nomenclature, 216-217
procedure

diploid selection, 217

HS1 and HS2, 218

library arrays preparation, 217 

mating, 217 

query lawn preparation, 217 

SM and DM, 218 

sporulation, 217-218
Systematic effects and normalization procedures, 

158-160 



Tagged image file format (TIFF), 27 
Tagging chromatin in vivo 
DNA locus, colocalization 

fluorescence, 551 

pore cluster, 552 
image acquisition 

high-precision widefield microscopy, 
543-545 

laser-scanning microscope, 543 

microscopic system, 542 
immobilizing cells 

agarose, 540-541 

live imaging, 541-542
lacO/tetO repeat plasmid, 539-540 
live cell time-lapse analysis 



laser-scanning microscopy, 545-546 
SD systems, 546-547 
locus mobility, quantification 
Brownian motion, 552-553
MSD analysis, 554-557 
spot tracking, 553 
track length and step size, 554 
nucleus, position determination, 540 
steps, 537 

tagged locus, 3D position determination 
Cavalieri's principle, 549 
decapping, 551 
ellipsoid, 548 
reconstruction, 547-548
subnuclear localization, 550 
temperature control, 542 
Tandem affinity purification (TAP), 747, 748 
Tandem mass spectrometry (MS/MS), 378 
Targeted data analysis 
intensity, 417 

ion-specific chromatogram, 416-417
retention time and signal intensity, 
415-416 
TCA. See Trichloroacetic acid 
Temperature-sensitive (Ts) mutants 

advantages, "diploid shuffle" methods, 

202-203 
characteristics, 182 
diploid shuffle-chromosome method 
description, 194-196
materials, 196-198 
methods, 198-202 
diploid shuffle-plasmid method 
description, 183-185
materials, 185-186 
methods, 186-194 
disadvantages, 203 
methodologies, 183 
Thin layer chromatography (TLC), 371 
TIFF. See Tagged image file format 
Topo-TA plasmid usage, 196 
Transcription factor binding sites (TFBS) 
ChIP-seq vs. ChIP-chip, 79
identification, 80 
neighboring genes, 101 
scoring process, 85 
Transcription factor selection 

complexity and relevant players, 512 
machinery component, 511-512
Transcriptome engineering 
phenotypes, selection 

liquid vs. solid media, 525-526 
membrane remodeling and osmoadaptive 

mechanisms, 524-525 
strategies and postselection screening, 

526-528 
validation, 528-529 
yeast, creation and maintenance, 525 
phenotypic diversity assessment 
calculation, 522-524 
mutant library evaluation, 521-522 
yeast transformation efficiency, 520-521 
plasmid library construction 

PCR, random mutagenesis, 515-518 
promoter selection, 512-515 
total sequence diversity and library 
maintenance, 518-519 
transcription factor selection 

complexity and relevant players, 512 
mutagenesis, 511-512 
Trichloroacetic acid (TCA), 784-785 
Triple quadrupole mass spectrometry 
quadrupole function, 411-412 
utilization, 412 
Ts mutants. See Temperature-sensitive (Ts) mutants 
Turbidostat, 492-493 



U 



Ubiquitin-proteasome system (UPS) 
aggresomes, 665 
misfolded membrane, 663 
Ultradian metabolic cycles, yeast 

chemostat use, oxygen consumption 
cell population, 859-860 
CEN.PK strain, 858-859 
growth media, long-period cycle, 860-861 
circadian rhythms, 857-858 
long-period cycles, 858, 860, 862 
short-period cycles, 862 
significance, condition 
log phase growth, 863 
steady-state growth, 864-865 
Unfolded protein response (UPR), 211 
Untargeted data analysis 
adduct, 418-420 
in-source fragmentation, 421 
isotopic variant, 420 
Upstream open reading frames (uORFs), 121 



W 



Well-NAPPA, 311-312 

Whole genome duplication (WGD), 459-460, 

840, 844, 845 
Wild-type (WT) strains, 4 



Y 



Yeast cells transformation, 191, 199-200 
Yeast chromosomes and nuclear architecture 
IF and FISH, fixed samples 

antibody dissociation, 563-564 

antibody purification and specificity, 558-559 

antibody treatment, 562-563 

cell permeabilization, 562 



DNA visualization, 565 

fixation, 560-561 

fluorophores choice, 559-560 

live imaging, 557 

in situ hybridization, 557-558, 563 

spheroplasting, 561-562, 566 

yeast strains and media, 558 
live microscopy, 536 
tagging chromatin in vivo 

DNA locus, colocalization, 551-552 

image acquisition, 542-547 

immobilizing cells, 540-542 

lacO/tetO repeat plasmid, 539-540 

live cell time-lapse analysis, 545-547 

locus mobility, quantification, 552-557 

nucleus, position determination, 540 

PCR-generated fragment cloning, 537, 539 

site-specific integration, 538 

steps, 537 

tagged locus, 3D position determination, 
547-551 

temperature control, 542 
Yeast lipid, MS 
analysis 

ESI-MS and ESI-MS/MS, 377-385 

sterol, GC-MS, 385-389 
eukaryotic organisms classes, 370-371 
extraction 

alkaline methanolysis, 376 

synthetic standards, 375-376 
isolation, cellular milieu, 372 
material normalization, 375 
sample preparation, 373-375 
S. cerevisiae structure, 370 
soft ionization, 371-372 
sterol isolation, 376-377 
traditional methods, 371 
Yeast peptone adenine dextrose (YPAD) 
agar plates, 804 
biolistic transformation, 810 
culture, 828 
standard growth, 803 
Yeast prions, 682 
Yellow fluorescent protein complementation 

assay (YFP-PCA), 310-311 
Y2H datasets 

autoactivators, 286 

binary interactome maps biological evaluation, 

291-292 
candidate interaction retesting, 286-287 
controls, 287-288 
implementation, 287-288 
validation 

artifacts, 288-289 

experimental assessment, 290-291 

individual interactions, 291 

protein pairs, 289-290 
YPAD. See Yeast peptone adenine dextrose 



